-
Building a Transformer Encoder from Scratch
Implement a single layer of a **Transformer Encoder** from scratch, without using `torch.nn.TransformerEncoderLayer`. This requires implementing a multi-head self-attention module and a position-wise feed-forward network, tied together with residual connections and layer normalization.
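A minimal sketch of such a layer, assuming a post-norm arrangement and illustrative dimensions (`d_model=512`, 8 heads, no masking):

```python
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    """Single Transformer encoder layer: multi-head self-attention + feed-forward."""
    def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        # Projections for queries, keys, values, and the attention output.
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Position-wise feed-forward network.
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x):                      # x: (batch, seq_len, d_model)
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, seq_len, d_head) for per-head attention.
        q, k, v = (y.view(b, t, self.n_heads, self.d_head).transpose(1, 2) for y in (q, k, v))
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5
        attn = scores.softmax(dim=-1)
        ctx = (attn @ v).transpose(1, 2).reshape(b, t, d)
        x = self.norm1(x + self.drop(self.out(ctx)))   # residual + layer norm
        x = self.norm2(x + self.drop(self.ff(x)))      # residual + layer norm
        return x

x = torch.randn(2, 10, 512)
print(EncoderLayer()(x).shape)   # torch.Size([2, 10, 512])
```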
-
Implementing a Siamese Network for Similarity Learning
Build and train a **Siamese network** on a dataset like MNIST. The network takes pairs of images as input and learns to determine if they belong to the same class (a positive pair) or different classes (a negative pair), typically by minimizing a contrastive loss over the distance between the two embeddings.
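A sketch of the shared-weight (twin) portion, assuming 28x28 grayscale inputs and an illustrative embedding size:

```python
import torch
import torch.nn as nn

class SiameseNet(nn.Module):
    """Shared-weight embedding network applied to both images of a pair."""
    def __init__(self, embed_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, embed_dim),
        )

    def forward(self, x1, x2):
        # The same encoder (same parameters) embeds both inputs.
        return self.encoder(x1), self.encoder(x2)

net = SiameseNet()
a, b = torch.randn(8, 1, 28, 28), torch.randn(8, 1, 28, 28)
z1, z2 = net(a, b)
dist = torch.nn.functional.pairwise_distance(z1, z2)   # per-pair embedding distance
print(dist.shape)   # torch.Size([8])
```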
-
Generative Adversarial Network (GAN) on MNIST
Implement and train a simple **Generative Adversarial Network (GAN)**. The network consists of a generator and a discriminator. The generator takes a random noise vector and tries to generate a realistic digit image, while the discriminator learns to distinguish real MNIST images from generated ones.
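A compact sketch of one adversarial training step, using fully connected networks and a random tensor as a stand-in for a batch of real MNIST images:

```python
import torch
import torch.nn as nn

latent_dim = 100

# Generator: noise vector -> flattened 28x28 image in [-1, 1].
G = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, 784), nn.Tanh(),
)
# Discriminator: flattened image -> probability of being real.
D = nn.Sequential(
    nn.Linear(784, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

bce = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.randn(64, 784)          # stand-in for a batch of real MNIST images
ones, zeros = torch.ones(64, 1), torch.zeros(64, 1)

# Discriminator step: push real images toward 1 and generated images toward 0.
fake = G(torch.randn(64, latent_dim))
d_loss = bce(D(real), ones) + bce(D(fake.detach()), zeros)
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: try to make the discriminator predict 1 for fakes.
g_loss = bce(D(fake), ones)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```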
-
Distributed Data Parallel Training
Set up a **distributed data parallel training** script using `torch.nn.parallel.DistributedDataParallel` and `torch.distributed`. You'll need to use `torch.multiprocessing.spawn` to launch one process per device, initialize the process group in each process, and wrap the model in DDP so gradients are synchronized across processes.
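A CPU-only sketch using the `gloo` backend; the model, the dummy data, and the master address/port are illustrative placeholders:

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP

def worker(rank, world_size):
    # Each spawned process joins the same process group.
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    model = torch.nn.Linear(10, 1)
    ddp_model = DDP(model)                      # gradients are all-reduced across ranks
    opt = torch.optim.SGD(ddp_model.parameters(), lr=0.1)

    x, y = torch.randn(32, 10), torch.randn(32, 1)
    loss = torch.nn.functional.mse_loss(ddp_model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, args=(2,), nprocs=2)
```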
-
Implementing a Graph Neural Network (GNN) Layer
Implement a simple **Graph Neural Network (GNN) layer** for node classification on a small graph. The layer should aggregate features from a node's neighbors and combine them with the node's own features before applying a learnable transformation and nonlinearity.
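One possible formulation, sketched with mean aggregation over a dense adjacency matrix:

```python
import torch
import torch.nn as nn

class GraphConvLayer(nn.Module):
    """Mean-aggregates neighbor features and combines them with the node's own features."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.self_lin = nn.Linear(in_dim, out_dim)
        self.neigh_lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (num_nodes, in_dim), adj: (num_nodes, num_nodes) binary adjacency matrix.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)   # avoid division by zero
        neigh_mean = (adj @ x) / deg                      # mean of neighbor features
        return torch.relu(self.self_lin(x) + self.neigh_lin(neigh_mean))

# Toy graph: 4 nodes, undirected edges 0-1, 1-2, 2-3.
adj = torch.tensor([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=torch.float)
x = torch.randn(4, 8)
print(GraphConvLayer(8, 16)(x, adj).shape)   # torch.Size([4, 16])
```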
-
Building a Neural Ordinary Differential Equation (NODE) Layer
Implement a simple **Neural Ordinary Differential Equation (NODE) layer**. This involves defining a `torch.nn.Module` that represents the derivative function $f(t, z(t))$ and then using an ODE solver (such as `torchdiffeq.odeint`) to integrate it over time as the layer's forward pass.
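A sketch assuming the third-party `torchdiffeq` package supplies the solver; the derivative network and integration interval are illustrative:

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint   # third-party: pip install torchdiffeq

class ODEFunc(nn.Module):
    """Parameterizes the derivative dz/dt = f(t, z)."""
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, t, z):
        return self.net(z)

class NODELayer(nn.Module):
    """Integrates the ODE from t=0 to t=1 and returns the final state."""
    def __init__(self, dim=16):
        super().__init__()
        self.func = ODEFunc(dim)
        self.t = torch.tensor([0.0, 1.0])

    def forward(self, z0):
        zt = odeint(self.func, z0, self.t)   # shape: (len(t), batch, dim)
        return zt[-1]

z0 = torch.randn(8, 16)
print(NODELayer()(z0).shape)   # torch.Size([8, 16])
```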
-
Implementing a Simple VAE for Text (Sentence VAE)
Implement a **Variational Autoencoder (VAE)** for text, often called a Sentence VAE. The encoder will be an RNN (e.g., GRU) that outputs a latent distribution, and the decoder will be another RNN that reconstructs the sentence from a sample of that latent code, trained with a reconstruction loss plus a KL divergence term.
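A minimal sketch with GRU encoder and decoder; the vocabulary size and dimensions are illustrative, and decoding is teacher-forced on the input tokens:

```python
import torch
import torch.nn as nn

class SentenceVAE(nn.Module):
    """GRU encoder -> latent Gaussian -> GRU decoder over token ids."""
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128, latent_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        self.latent_to_hidden = nn.Linear(latent_dim, hidden_dim)
        self.decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):                       # tokens: (batch, seq_len) int64
        emb = self.embed(tokens)
        _, h = self.encoder(emb)                     # h: (1, batch, hidden_dim)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
        h0 = self.latent_to_hidden(z).unsqueeze(0)
        dec_out, _ = self.decoder(emb, h0)           # teacher-forced reconstruction
        return self.out(dec_out), mu, logvar

model = SentenceVAE()
tokens = torch.randint(0, 1000, (4, 12))
logits, mu, logvar = model(tokens)
# Loss = reconstruction (cross-entropy) + KL divergence to the standard normal prior.
recon = nn.functional.cross_entropy(logits.view(-1, 1000), tokens.view(-1))
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon + kl
```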
-
Differentiating Through a Non-differentiable Function with `torch.autograd.Function`
Implement a **custom `torch.autograd.Function`** for a non-differentiable operation, such as a custom quantization function. The `forward` method will perform the non-differentiable operation, and the `backward` method will define a surrogate gradient, for example a straight-through estimator that passes the incoming gradient through unchanged.
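A sketch using rounding as the quantization step and a straight-through (identity) backward pass:

```python
import torch

class RoundSTE(torch.autograd.Function):
    """Rounds in the forward pass; passes gradients straight through in backward."""
    @staticmethod
    def forward(ctx, x):
        return torch.round(x)            # non-differentiable step

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output               # straight-through estimator: identity gradient

x = torch.randn(5, requires_grad=True)
y = RoundSTE.apply(x).sum()
y.backward()
print(x.grad)   # all ones, as if rounding were the identity
```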
-
Batch Normalization From Scratch
Implement 1D batch normalization manually (without using `nn.BatchNorm1d`). Steps: 1. Compute the batch mean and variance. 2. Normalize the inputs. 3. Scale and shift with learnable parameters $\gamma$ and $\beta$.
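A sketch covering training mode only (no running statistics), checked against `nn.BatchNorm1d`:

```python
import torch
import torch.nn as nn

class ManualBatchNorm1d(nn.Module):
    """Training-mode 1D batch norm over the batch dimension."""
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(num_features))   # learnable scale
        self.beta = nn.Parameter(torch.zeros(num_features))   # learnable shift
        self.eps = eps

    def forward(self, x):                 # x: (batch, num_features)
        mean = x.mean(dim=0)              # 1. batch statistics
        var = x.var(dim=0, unbiased=False)
        x_hat = (x - mean) / torch.sqrt(var + self.eps)   # 2. normalize
        return self.gamma * x_hat + self.beta             # 3. scale and shift

x = torch.randn(32, 10)
manual = ManualBatchNorm1d(10)(x)
reference = nn.BatchNorm1d(10)(x)        # fresh module in training mode
print(torch.allclose(manual, reference, atol=1e-5))   # True
```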
-
Debug Exploding Gradients
Create a deep feedforward net (20 layers, ReLU). Train it on dummy data. Track gradient norms across layers. Observe whether gradients explode. Experiment with: - Smaller learning rate. - Gradient clipping (`torch.nn.utils.clip_grad_norm_`).
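A sketch of the setup; the large initialization scale is deliberate so the growth in gradient norms is easy to see:

```python
import torch
import torch.nn as nn

# 20-layer ReLU MLP; a deliberately large init scale makes gradient growth visible.
layers = []
for _ in range(20):
    lin = nn.Linear(128, 128)
    nn.init.normal_(lin.weight, std=0.5)
    layers += [lin, nn.ReLU()]
model = nn.Sequential(*layers, nn.Linear(128, 1))

x, y = torch.randn(64, 128), torch.randn(64, 1)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()

# Per-layer gradient norms, from input-side layers to output-side layers.
for i, layer in enumerate(model):
    if isinstance(layer, nn.Linear):
        print(f"layer {i:2d} grad norm: {layer.weight.grad.norm():.3e}")

# One possible remedy: clip the global gradient norm before the optimizer step.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```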
-
Implement a Siamese Network
Implement a Siamese network for MNIST digit similarity: - Two identical CNNs sharing weights. - Contrastive loss function. - Train on pairs of digits (same/different). Evaluate on test pairs.
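The contrastive loss itself might look like the sketch below, assuming label 1 for same-class pairs and 0 for different-class pairs, with an illustrative margin:

```python
import torch

def contrastive_loss(z1, z2, label, margin=1.0):
    """label = 1 for same-class pairs, 0 for different-class pairs."""
    dist = torch.nn.functional.pairwise_distance(z1, z2)
    # Same-class pairs are pulled together; different-class pairs are pushed
    # apart until their distance exceeds the margin.
    pos = label * dist.pow(2)
    neg = (1 - label) * torch.clamp(margin - dist, min=0).pow(2)
    return 0.5 * (pos + neg).mean()

z1, z2 = torch.randn(8, 64), torch.randn(8, 64)
label = torch.randint(0, 2, (8,)).float()
print(contrastive_loss(z1, z2, label))
```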
-
Create a Transformer Encoder Block
Implement a single Transformer encoder block: - Multi-head self-attention. - Layer normalization. - Feedforward network. Compare output with `nn.TransformerEncoderLayer`.
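For the comparison step, the built-in layer can be instantiated as below (matching your custom block's output exactly would also require copying its weights across, which is omitted here):

```python
import torch
import torch.nn as nn

# Reference layer with dropout disabled so the comparison is deterministic.
ref = nn.TransformerEncoderLayer(d_model=512, nhead=8, dim_feedforward=2048,
                                 dropout=0.0, batch_first=True)
ref.eval()

x = torch.randn(2, 10, 512)
with torch.no_grad():
    out = ref(x)
print(out.shape)   # torch.Size([2, 10, 512])
```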
-
DataParallel Basics
Simulate training with `torch.nn.DataParallel`: - Define a simple CNN. - Run it on 2 GPUs (if available). - Verify that the batch is split across devices. Inspect `model.module` usage.
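A sketch with a CPU fallback; the print inside `forward` shows the per-replica batch size when the module is replicated across GPUs:

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, 3, padding=1)
        self.fc = nn.Linear(8 * 28 * 28, 10)

    def forward(self, x):
        # Each replica sees only its slice of the batch.
        print("replica batch size:", x.shape[0], "on", x.device)
        return self.fc(torch.relu(self.conv(x)).flatten(1))

model = SmallCNN()
if torch.cuda.device_count() >= 2:
    model = nn.DataParallel(model).cuda()       # splits the batch across GPUs
    x = torch.randn(32, 1, 28, 28).cuda()
else:
    x = torch.randn(32, 1, 28, 28)              # CPU fallback: no splitting happens

out = model(x)
# With DataParallel, the underlying module is reached via model.module.
underlying = model.module if isinstance(model, nn.DataParallel) else model
print(type(underlying).__name__, out.shape)
```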