-
Batch Normalization From Scratch
Implement 1D batch normalization manually (without using `nn.BatchNorm1d`). Steps: 1. Compute batch mean and variance. 2. Normalize inputs. 3. Scale and shift with learnable $$\gamma, \beta$$....
-
Debug Exploding Gradients
Create a deep feedforward net (20 layers, ReLU). Train it on dummy data. Track gradient norms across layers. Observe if gradients explode. Experiment with: - Smaller learning rate. - Gradient...
-
Implement a Siamese Network
Implement a Siamese network for MNIST digit similarity: - Two identical CNNs sharing weights. - Contrastive loss function. - Train on pairs of digits (same/different). Evaluate on test pairs.
-
Create a Transformer Encoder Block
Implement a single Transformer encoder block: - Multi-head self-attention. - Layer normalization. - Feedforward network. Compare output with `nn.TransformerEncoderLayer`.
-
Distributed DataParallel Basics
Simulate training with `torch.nn.DataParallel`: - Define a simple CNN. - Run it on 2 GPUs (if available). - Verify batch is split across devices. Inspect `model.module` usage.