-
Implementing Gradient Clipping
Implement **gradient clipping** in your training loop. Clipping caps the magnitude of the gradients to guard against exploding gradients, a common problem in RNNs and other deep networks. After the backward pass...
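To make the idea concrete, here is a minimal sketch of global-norm clipping written by hand. The helper name `clip_by_global_norm`, the toy `nn.RNN` model, the random data, and the threshold of 0.5 are all assumptions for illustration; in practice `torch.nn.utils.clip_grad_norm_` performs the same kind of rescaling.

```python
import torch
import torch.nn as nn

def clip_by_global_norm(parameters, max_norm):
    """Hypothetical helper: rescale gradients in place so that their
    combined L2 norm is at most max_norm. Returns the norm before clipping."""
    grads = [p.grad for p in parameters if p.grad is not None]
    if not grads:
        return torch.tensor(0.0)
    total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = max_norm / (total_norm + 1e-6)
    if scale < 1.0:
        for g in grads:
            g.mul_(scale)  # shrink every gradient by the same factor
    return total_norm

torch.manual_seed(0)
model = nn.RNN(input_size=4, hidden_size=8, batch_first=True)  # toy model (assumption)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(16, 10, 4)        # dummy inputs: (batch, seq_len, features)
target = torch.randn(16, 10, 8)   # dummy targets matching the RNN output shape

optimizer.zero_grad()
output, _ = model(x)
loss = nn.functional.mse_loss(output, target)
loss.backward()
# Clip after backward() and before the optimizer update.
norm_before = clip_by_global_norm(list(model.parameters()), max_norm=0.5)
optimizer.step()
```

Because every gradient is scaled by the same factor, the update direction is preserved and only its length changes.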
-
Implementing a Custom `nn.Module` for a Gated Recurrent Unit (GRU)
Implement a **custom GRU cell** as a subclass of `torch.nn.Module`. Your implementation should handle the reset gate, update gate, and the new hidden state computation from scratch, using...
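One possible shape for such a cell is sketched below. It assumes the gate equations and the (reset, update, candidate) ordering used by `torch.nn.GRUCell`, and it uses two `nn.Linear` layers for the input and hidden projections; the class name `CustomGRUCell` and the attribute names `x2h` and `h2h` are illustrative choices, not part of the exercise.

```python
import torch
import torch.nn as nn

class CustomGRUCell(nn.Module):
    """Sketch of a GRU cell following the nn.GRUCell gate convention."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        # Each projection produces pre-activations for the reset gate,
        # the update gate, and the candidate state, stacked along the last dim.
        self.x2h = nn.Linear(input_size, 3 * hidden_size)
        self.h2h = nn.Linear(hidden_size, 3 * hidden_size)

    def forward(self, x, h):
        x_r, x_z, x_n = self.x2h(x).chunk(3, dim=-1)
        h_r, h_z, h_n = self.h2h(h).chunk(3, dim=-1)
        r = torch.sigmoid(x_r + h_r)      # reset gate
        z = torch.sigmoid(x_z + h_z)      # update gate
        n = torch.tanh(x_n + r * h_n)     # candidate hidden state
        return (1 - z) * n + z * h        # blend candidate with previous state

torch.manual_seed(0)
cell = CustomGRUCell(input_size=4, hidden_size=6)
x = torch.randn(3, 4)   # (batch, input_size)
h = torch.zeros(3, 6)   # (batch, hidden_size)
h_next = cell(x, h)     # shape: (3, 6)
```

A useful sanity check is to copy the weights of an `nn.GRUCell` into this module and confirm that both produce the same output for the same inputs.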
-
Implementing a Simple Attention Mechanism
Implement a **simple attention mechanism** for a sequence-to-sequence model. Given a sequence of encoder outputs and a single decoder hidden state, your attention module should compute attention...
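As a starting point, the sketch below uses dot-product scoring, which is an assumption since the prompt does not fix a scoring function. It also assumes the encoder outputs and the decoder hidden state share the same hidden size. The class name `DotProductAttention` is illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DotProductAttention(nn.Module):
    """Sketch: score each encoder output against one decoder hidden state."""

    def forward(self, encoder_outputs, decoder_hidden):
        # encoder_outputs: (batch, src_len, hidden)
        # decoder_hidden:  (batch, hidden)
        scores = torch.bmm(encoder_outputs, decoder_hidden.unsqueeze(2)).squeeze(2)
        weights = F.softmax(scores, dim=1)  # (batch, src_len), each row sums to 1
        # Weighted sum of encoder outputs gives the context vector.
        context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)
        return context, weights

torch.manual_seed(0)
attention = DotProductAttention()
encoder_outputs = torch.randn(2, 5, 8)  # dummy data: batch=2, src_len=5, hidden=8
decoder_hidden = torch.randn(2, 8)
context, weights = attention(encoder_outputs, decoder_hidden)
print(context.shape, weights.shape)  # torch.Size([2, 8]) torch.Size([2, 5])
```

Additive (MLP-based) scoring would replace only the line that computes `scores`; the softmax and weighted sum stay the same.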
-
Implementing a Simple VAE for Text (Sentence VAE)
Implement a **Variational Autoencoder (VAE)** for text, often called a Sentence VAE. The encoder will be an RNN (e.g., GRU) that outputs the parameters of a latent distribution, and the decoder will be another RNN...
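A compact sketch of this architecture follows. It assumes a diagonal Gaussian latent (mean and log-variance), a GRU decoder whose initial hidden state is computed from the latent sample, and teacher forcing during training. The class name `SentenceVAE`, the helper `vae_loss`, the layer sizes, and the `kl_weight` parameter are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceVAE(nn.Module):
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64, latent_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        self.latent_to_hidden = nn.Linear(latent_dim, hidden_dim)
        self.decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        emb = self.embed(tokens)                 # (batch, seq_len, embed_dim)
        _, h_n = self.encoder(emb)               # h_n: (1, batch, hidden_dim)
        mu = self.to_mu(h_n[-1])
        logvar = self.to_logvar(h_n[-1])
        # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
        std = torch.exp(0.5 * logvar)
        z = mu + std * torch.randn_like(std)
        h0 = torch.tanh(self.latent_to_hidden(z)).unsqueeze(0)
        # Teacher forcing: feed tokens[:-1] and predict tokens[1:].
        dec_out, _ = self.decoder(emb[:, :-1], h0)
        logits = self.out(dec_out)               # (batch, seq_len - 1, vocab_size)
        return logits, mu, logvar

def vae_loss(logits, tokens, mu, logvar, kl_weight=1.0):
    """Reconstruction cross-entropy plus KL to a standard normal, per sentence."""
    recon = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        tokens[:, 1:].reshape(-1),
        reduction="sum",
    )
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return (recon + kl_weight * kl) / tokens.size(0)

torch.manual_seed(0)
vae = SentenceVAE(vocab_size=50)
tokens = torch.randint(0, 50, (8, 12))   # dummy token ids: batch=8, seq_len=12
logits, mu, logvar = vae(tokens)
loss = vae_loss(logits, tokens, mu, logvar)
loss.backward()
```

Sentence VAEs are often trained with `kl_weight` annealed upward from 0, since a strong RNN decoder can otherwise learn to ignore the latent code.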
-
Gradient Clipping Example
Write code to:
1. Train a small RNN on dummy data.
2. Add gradient clipping using `torch.nn.utils.clip_grad_norm_`.
3. Print gradient norms before and after clipping.

Show that exploding gradients...
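The steps above can be sketched as follows. The model `TinyRNN`, the helper `grad_norm`, the ReLU nonlinearity, the oversized weight initialization (chosen to make large gradients likely), and the hyperparameters are all assumptions for illustration rather than a prescribed solution.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class TinyRNN(nn.Module):
    def __init__(self, input_size=4, hidden_size=16):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, nonlinearity="relu", batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.rnn(x)
        return self.head(out[:, -1])  # predict from the last time step

def grad_norm(model):
    """Global L2 norm over all parameter gradients."""
    norms = [p.grad.norm(2) for p in model.parameters() if p.grad is not None]
    return torch.norm(torch.stack(norms), 2).item()

model = TinyRNN()
with torch.no_grad():
    for p in model.rnn.parameters():
        p.normal_(std=0.6)  # assumption: deliberately large init to provoke big gradients

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
x = torch.randn(32, 15, 4)  # dummy inputs: (batch, seq_len, features)
y = torch.randn(32, 1)      # dummy regression targets
max_norm = 1.0
history = []

for step in range(5):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    before = grad_norm(model)
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=max_norm)
    after = grad_norm(model)
    history.append((before, after))
    print(f"step {step}: loss={loss.item():.3e}  norm before={before:.3e}  after={after:.3e}")
    optimizer.step()
```

`clip_grad_norm_` also returns the total norm measured before clipping, so the separate `grad_norm` call before it is only there to make the comparison explicit.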