-
Weight Initialization Techniques
Initialize a neural network's weights using different schemes: - Xavier initialization. - Kaiming initialization. Show histograms of weight distributions before and after initialization.
-
Debug Exploding Gradients
Create a deep feedforward net (20 layers, ReLU). Train it on dummy data. Track gradient norms across layers. Observe if gradients explode. Experiment with: - Smaller learning rate. - Gradient...
-
Custom Collate Function
Write a custom `collate_fn` for `DataLoader` that pads variable-length sequences with zeros. Use `torch.nn.utils.rnn.pad_sequence`. Test by batching random-length tensors.
-
Visualize Training with TensorBoard
Integrate TensorBoard into a training loop: - Log training loss and validation accuracy. - Add histograms of weights and gradients. - Write a few sample images. Open TensorBoard and verify logs.
-
Implement a Siamese Network
Implement a Siamese network for MNIST digit similarity: - Two identical CNNs sharing weights. - Contrastive loss function. - Train on pairs of digits (same/different). Evaluate on test pairs.
-
Gradient Accumulation Example
Simulate large-batch training using gradient accumulation: - Train with microbatches of size 4. - Accumulate gradients over 8 steps. - Update optimizer after accumulation. Verify final result...
-
Implement Early Stopping
Add early stopping to a training loop: - Monitor validation loss. - Stop training if no improvement after 5 epochs. - Save best model checkpoint. Demonstrate on MNIST subset.
-
Create a Transformer Encoder Block
Implement a single Transformer encoder block: - Multi-head self-attention. - Layer normalization. - Feedforward network. Compare output with `nn.TransformerEncoderLayer`.
-
Implement Label Smoothing
Write a function to apply label smoothing for classification: - Replace one-hot targets with $$1-\epsilon$$ for true class, $$\epsilon/(K-1)$$ for others. - Use it in cross-entropy training. Show...
-
Save and Load TorchScript Model
Convert a trained PyTorch model to TorchScript via tracing and scripting. Save it to disk. Reload and run inference. Compare outputs with the original model.
-
Distributed DataParallel Basics
Simulate training with `torch.nn.DataParallel`: - Define a simple CNN. - Run it on 2 GPUs (if available). - Verify batch is split across devices. Inspect `model.module` usage.
-
Mixed Precision Training with autocast
Modify a training loop to use `torch.cuda.amp.autocast`: - Wrap forward + loss in `autocast`. - Use `GradScaler` for backward. Compare training speed vs. full precision.