-
Gradient Clipping Example
Write code to: 1. Train a small RNN on dummy data. 2. Add gradient clipping using `torch.nn.utils.clip_grad_norm_`. 3. Print gradient norms before and after clipping. Show that exploding gradients...
-
Debug Exploding Gradients
Create a deep feedforward net (20 layers, ReLU). Train it on dummy data. Track gradient norms across layers. Observe if gradients explode. Experiment with: - Smaller learning rate. - Gradient...
-
Gradient Accumulation Example
Simulate large-batch training using gradient accumulation: - Train with microbatches of size 4. - Accumulate gradients over 8 steps. - Update optimizer after accumulation. Verify final result...
1