-
Implementing a Custom `nn.Module` for a Gated Recurrent Unit (GRU)
Implement a **custom GRU cell** as a subclass of `torch.nn.Module`. Your implementation should handle the reset gate, update gate, and the new hidden state computation from scratch, using basic tensor operations (matrix multiplication, `torch.sigmoid`, `torch.tanh`) rather than the built-in `nn.GRU`.
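A minimal sketch of such a cell, following the standard GRU equations ($r$ and $z$ via sigmoids, candidate state $n$ via tanh, then $h' = (1-z) \odot n + z \odot h$); fusing the three gates into one linear layer per source is an implementation choice, not a requirement:

```python
import torch
import torch.nn as nn

class MyGRUCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # One fused linear map per source: input-to-gates and hidden-to-gates.
        self.x2gates = nn.Linear(input_size, 3 * hidden_size)
        self.h2gates = nn.Linear(hidden_size, 3 * hidden_size)

    def forward(self, x, h):
        xr, xz, xn = self.x2gates(x).chunk(3, dim=-1)
        hr, hz, hn = self.h2gates(h).chunk(3, dim=-1)
        r = torch.sigmoid(xr + hr)   # reset gate
        z = torch.sigmoid(xz + hz)   # update gate
        n = torch.tanh(xn + r * hn)  # candidate hidden state
        return (1 - z) * n + z * h   # interpolate between candidate and old state
```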
-
Custom Data Augmentation Pipeline
Create a **custom data augmentation pipeline** using PyTorch's `transforms`. For a given dataset (e.g., a custom image dataset), implement a series of transformations like random rotation, horizontal flipping, and color jitter, and chain them with `transforms.Compose`.
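One possible pipeline; the specific transforms and parameter values here are illustrative, not prescribed by the exercise:

```python
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomRotation(degrees=15),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])
# Pass it to the dataset, e.g. MyImageDataset(root, transform=train_transform),
# where MyImageDataset stands in for whatever custom dataset class you use.
```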
-
Implementing a Custom Learning Rate Scheduler
Implement a **custom learning rate scheduler** that follows a cosine annealing schedule. The learning rate starts high and decreases smoothly to a minimum value, then resets and repeats. Your scheduler should update the optimizer's learning rate each time `step()` is called.
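One common formulation is $\eta_t = \eta_{\min} + \frac{1}{2}(\eta_{\max} - \eta_{\min})(1 + \cos(\pi t / T))$ with cycle length $T$. A standalone sketch (PyTorch's built-in `CosineAnnealingWarmRestarts` covers the same idea):

```python
import math

class CosineAnnealingScheduler:
    """Cosine annealing with hard restarts every `period` steps (a sketch)."""
    def __init__(self, optimizer, lr_max, lr_min, period):
        self.optimizer = optimizer
        self.lr_max, self.lr_min, self.period = lr_max, lr_min, period
        self.t = 0

    def step(self):
        t = self.t % self.period  # position within the current cycle
        lr = self.lr_min + 0.5 * (self.lr_max - self.lr_min) * (
            1 + math.cos(math.pi * t / self.period))
        for group in self.optimizer.param_groups:
            group["lr"] = lr
        self.t += 1
```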
-
Transfer Learning with a Pre-trained Model
Fine-tune a **pre-trained model** (e.g., `resnet18` from `torchvision.models`) on a new, small image classification dataset (e.g., `CIFAR-10`). You'll need to freeze the weights of the initial layers and replace the final fully connected layer with a new head sized for the target classes.
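A sketch of the freeze-and-replace step; the `weights=` argument assumes torchvision 0.13+ (older releases use `pretrained=True`):

```python
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False                   # freeze the whole backbone
model.fc = nn.Linear(model.fc.in_features, 10)    # new head for 10 CIFAR-10 classes
# Only the new head has requires_grad=True, so pass model.fc.parameters()
# (or filter on requires_grad) to the optimizer.
```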
-
Implementing a Simple Attention Mechanism
Implement a **simple attention mechanism** for a sequence-to-sequence model. Given a sequence of encoder outputs and a single decoder hidden state, your attention module should compute attention scores, normalize them with a softmax, and return a context vector as the weighted sum of the encoder outputs.
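A minimal dot-product variant (one of several valid scoring functions):

```python
import torch
import torch.nn as nn

class DotProductAttention(nn.Module):
    def forward(self, decoder_hidden, encoder_outputs):
        # decoder_hidden: (batch, hidden); encoder_outputs: (batch, seq_len, hidden)
        scores = torch.bmm(encoder_outputs, decoder_hidden.unsqueeze(-1)).squeeze(-1)
        weights = torch.softmax(scores, dim=-1)        # (batch, seq_len)
        context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)
        return context, weights                        # (batch, hidden), (batch, seq_len)
```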
-
Building a Custom `Dataset` and `DataLoader`
Create a **custom `torch.utils.data.Dataset` class** to load a simple, non-image dataset (e.g., from a CSV file). The `__init__` method should read the data, `__len__` should return the total number of samples, and `__getitem__` should return a single sample as tensors. Wrap the dataset in a `DataLoader` to iterate over batches.
-
Implementing Weight Initialization Schemes
Implement **different weight initialization schemes** (e.g., Xavier/Glorot, He) for a simple neural network. Create a function that iterates through a model's parameters and applies a chosen scheme (e.g., `nn.init.xavier_uniform_` or `nn.init.kaiming_normal_`) to each weight tensor.
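A sketch that covers `nn.Linear` layers; which module types to initialize is a design choice:

```python
import torch.nn as nn

def init_weights(model, scheme="xavier"):
    for module in model.modules():
        if isinstance(module, nn.Linear):
            if scheme == "xavier":
                nn.init.xavier_uniform_(module.weight)
            elif scheme == "he":
                nn.init.kaiming_normal_(module.weight, nonlinearity="relu")
            if module.bias is not None:
                nn.init.zeros_(module.bias)
```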
-
Implementing Layer Normalization from Scratch
Implement **Layer Normalization** as a custom `torch.nn.Module`. Unlike `BatchNorm`, `LayerNorm` normalizes across the features of a single sample, not across the batch. Your implementation should compute the per-sample mean and variance over the feature dimension and include learnable scale ($\gamma$) and shift ($\beta$) parameters.
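A sketch normalizing over the last dimension, matching the usual formulation $y = \gamma \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta$:

```python
import torch
import torch.nn as nn

class MyLayerNorm(nn.Module):
    def __init__(self, features, eps=1e-5):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(features))   # learnable scale
        self.beta = nn.Parameter(torch.zeros(features))   # learnable shift
        self.eps = eps

    def forward(self, x):
        mean = x.mean(dim=-1, keepdim=True)
        var = x.var(dim=-1, keepdim=True, unbiased=False)  # biased, as in nn.LayerNorm
        return self.gamma * (x - mean) / torch.sqrt(var + self.eps) + self.beta
```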
-
Manual Gradient Descent Step
Simulate one step of gradient descent for a simple quadratic loss.
### Problem
Given a scalar parameter $w$ initialized at 5.0, minimize the loss $L(w) = (w - 3)^2$ using PyTorch.
- **Input:** the initial value $w = 5.0$ and a learning rate (e.g., 0.1).
- **Output:** the value of $w$ after one manual gradient-descent step.
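Since $\frac{dL}{dw} = 2(w - 3) = 4$ at $w = 5$, one step with learning rate 0.1 yields $w = 5 - 0.1 \cdot 4 = 4.6$. A sketch:

```python
import torch

w = torch.tensor(5.0, requires_grad=True)
loss = (w - 3) ** 2
loss.backward()                # w.grad is now 2 * (5 - 3) = 4.0
with torch.no_grad():
    w -= 0.1 * w.grad          # manual update: 5.0 - 0.1 * 4.0
    w.grad.zero_()             # clear the gradient for the next step
print(w.item())                # ~4.6
```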
-
Custom Dataset Class
Create a custom PyTorch `Dataset` for pairs of numbers and their sum.
### Problem
Implement a dataset where each sample is `(x, y, x+y)`.
- **Input:** A list of tuples `(x, y)`.
- **Output:** For index `i`, the triple `(x, y, x+y)` returned as tensors.
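A minimal sketch:

```python
import torch
from torch.utils.data import Dataset

class SumPairsDataset(Dataset):
    def __init__(self, pairs):
        self.pairs = pairs                 # list of (x, y) tuples

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, i):
        x, y = self.pairs[i]
        return (torch.tensor(x, dtype=torch.float32),
                torch.tensor(y, dtype=torch.float32),
                torch.tensor(x + y, dtype=torch.float32))
```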
-
Implement a Simple MLP
Build and run a minimal Multi-Layer Perceptron (MLP) using `torch.nn`.
### Problem
Construct a 2-layer MLP with ReLU activation for input of size 10 and output of size 2.
- **Input:** Tensor of shape `(batch_size, 10)`.
- **Output:** Tensor of shape `(batch_size, 2)`.
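A possible solution, with an arbitrary hidden width of 32:

```python
import torch
import torch.nn as nn

mlp = nn.Sequential(
    nn.Linear(10, 32),   # input size 10 -> hidden
    nn.ReLU(),
    nn.Linear(32, 2),    # hidden -> output size 2
)
out = mlp(torch.randn(4, 10))
print(out.shape)         # torch.Size([4, 2])
```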
-
Implement a Custom Loss Function
Create a custom loss function called `MeanAbsolutePercentageError` (MAPE) in PyTorch. It should:
1. Take predictions and targets as input tensors.
2. Compute $$\frac{1}{n} \sum_i \frac{|y_i - \hat{y}_i|}{|y_i|}.$$
3. Return the result as a scalar tensor.
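A sketch; the `eps` guard against division by zero is an added assumption, not part of the stated formula:

```python
import torch
import torch.nn as nn

class MeanAbsolutePercentageError(nn.Module):
    def __init__(self, eps=1e-8):
        super().__init__()
        self.eps = eps  # avoids division by zero when a target is 0

    def forward(self, preds, targets):
        return torch.mean(torch.abs(targets - preds) / (torch.abs(targets) + self.eps))
```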
-
Custom Dataset for CSV Data
Write a PyTorch `Dataset` class that loads data from a CSV file containing tabular data (features + labels). Requirements:
- Use `pandas` to read the CSV.
- Convert features and labels to tensors.
- Implement `__len__` and `__getitem__` so the class works with a `DataLoader`.
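A sketch that assumes the label column is named `label`; adjust to the actual CSV schema:

```python
import pandas as pd
import torch
from torch.utils.data import Dataset

class CSVDataset(Dataset):
    def __init__(self, csv_path):
        df = pd.read_csv(csv_path)
        self.labels = torch.tensor(df["label"].values, dtype=torch.long)
        self.features = torch.tensor(
            df.drop(columns=["label"]).values, dtype=torch.float32)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        return self.features[i], self.labels[i]
```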
-
Gradient Clipping Example
Write code to:
1. Train a small RNN on dummy data.
2. Add gradient clipping using `torch.nn.utils.clip_grad_norm_`.
3. Print gradient norms before and after clipping.
Show that exploding gradients are kept in check by the clipping step.
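A sketch of the mechanics on dummy data (a freshly initialized model will not actually explode; the point is where the clipping call sits, and that `clip_grad_norm_` returns the pre-clip norm):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn, head = nn.RNN(8, 16, batch_first=True), nn.Linear(16, 1)
params = list(rnn.parameters()) + list(head.parameters())
optimizer = torch.optim.SGD(params, lr=0.01)

x, y = torch.randn(4, 50, 8), torch.randn(4, 1)   # dummy sequences and targets
out, _ = rnn(x)
loss = nn.functional.mse_loss(head(out[:, -1]), y)
loss.backward()
pre_clip = torch.nn.utils.clip_grad_norm_(params, max_norm=1.0)
print(f"grad norm before clipping: {pre_clip:.4f} (clipped to <= 1.0)")
optimizer.step()
```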
-
Implement Dropout Manually
Implement dropout as a function `my_dropout(x, p)`:
- Zero out elements of `x` with probability `p`.
- Scale survivors by $$1/(1-p)$$.
- Ensure deterministic behavior when `torch.manual_seed` is set.
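A sketch using inverted dropout:

```python
import torch

def my_dropout(x, p):
    if p == 0:
        return x
    mask = (torch.rand_like(x) >= p).float()  # keep each element with prob 1 - p
    return x * mask / (1 - p)                 # scale survivors by 1/(1-p)

torch.manual_seed(0)
print(my_dropout(torch.ones(5), p=0.5))       # deterministic given the seed
```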
-
Custom Activation Function
Define a custom activation function called `Swish`: $$f(x) = x \cdot \sigma(x)$$.
- Implement it as a PyTorch `nn.Module`.
- Train a small MLP on random data with it.
- Compare with ReLU on the same task.
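A sketch of the module; it drops into an `nn.Sequential` wherever `nn.ReLU()` would go:

```python
import torch
import torch.nn as nn

class Swish(nn.Module):
    def forward(self, x):
        return x * torch.sigmoid(x)   # f(x) = x * sigma(x)

mlp = nn.Sequential(nn.Linear(10, 32), Swish(), nn.Linear(32, 2))
```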
-
Weight Initialization Techniques
Initialize a neural network's weights using different schemes:
- Xavier initialization.
- Kaiming initialization.
Show histograms of weight distributions before and after initialization.
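One way to produce the histograms (matplotlib assumed available; the `.clone()` keeps the "before" snapshot from being overwritten by the in-place re-initialization):

```python
import matplotlib.pyplot as plt
import torch.nn as nn

layer = nn.Linear(256, 256)
before = layer.weight.detach().clone().flatten().numpy()
nn.init.kaiming_normal_(layer.weight, nonlinearity="relu")
after = layer.weight.detach().flatten().numpy()

plt.hist(before, bins=50, alpha=0.5, label="default init")
plt.hist(after, bins=50, alpha=0.5, label="kaiming_normal_")
plt.legend()
plt.show()
```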
-
Custom Collate Function
Write a custom `collate_fn` for `DataLoader` that pads variable-length sequences with zeros. Use `torch.nn.utils.rnn.pad_sequence`. Test by batching random-length tensors.
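A sketch batching random-length 1-D tensors:

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

def pad_collate(batch):
    lengths = torch.tensor([len(seq) for seq in batch])
    padded = pad_sequence(batch, batch_first=True, padding_value=0.0)
    return padded, lengths                    # (batch, max_len), original lengths

seqs = [torch.randn(torch.randint(3, 10, (1,)).item()) for _ in range(8)]
loader = DataLoader(seqs, batch_size=4, collate_fn=pad_collate)
for padded, lengths in loader:
    print(padded.shape, lengths)
```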
-
Visualize Training with TensorBoard
Integrate TensorBoard into a training loop:
- Log training loss and validation accuracy.
- Add histograms of weights and gradients.
- Write a few sample images.
Open TensorBoard and verify the logs.
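The core logging calls look roughly like this (the tags and log directory are arbitrary; `model`, `loss`, and the loop variables are assumed to exist in your training code):

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/exp1")
# Inside the training loop:
#   writer.add_scalar("loss/train", loss.item(), global_step)
#   writer.add_scalar("accuracy/val", val_acc, epoch)
#   for name, p in model.named_parameters():
#       writer.add_histogram(name, p, epoch)
#       if p.grad is not None:
#           writer.add_histogram(name + ".grad", p.grad, epoch)
#   writer.add_images("samples", image_batch, epoch)   # image_batch: (N, C, H, W)
writer.close()
# Then run: tensorboard --logdir runs
```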
-
Gradient Accumulation Example
Simulate large-batch training using gradient accumulation:
- Train with microbatches of size 4.
- Accumulate gradients over 8 steps.
- Update the optimizer after accumulation.
Verify that the final result approximates training with a single batch of size 32.
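A self-contained sketch; dividing each microbatch loss by the number of accumulation steps makes the summed gradients equal the large-batch average:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
microbatches = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(8)]

optimizer.zero_grad()
for step, (x, y) in enumerate(microbatches):
    loss = nn.functional.mse_loss(model(x), y) / len(microbatches)
    loss.backward()                            # gradients accumulate across calls
    if (step + 1) % len(microbatches) == 0:
        optimizer.step()                       # one update ~ a batch of 32
        optimizer.zero_grad()
```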
-
Implement Early Stopping
Add early stopping to a training loop:
- Monitor validation loss.
- Stop training if no improvement after 5 epochs.
- Save the best model checkpoint.
Demonstrate on an MNIST subset.
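The bookkeeping, as a sketch; `train_one_epoch` and `validate` are hypothetical stand-ins for your own loop and evaluation functions:

```python
import torch

best_loss, patience, bad_epochs = float("inf"), 5, 0
for epoch in range(100):
    train_one_epoch(model, train_loader)        # hypothetical helper
    val_loss = validate(model, val_loader)      # hypothetical helper
    if val_loss < best_loss:
        best_loss, bad_epochs = val_loss, 0
        torch.save(model.state_dict(), "best.pt")  # checkpoint the best model
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break                               # 5 epochs with no improvement
```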
-
Implement Label Smoothing
Write a function to apply label smoothing for classification:
- Replace one-hot targets with $$1-\epsilon$$ for the true class and $$\epsilon/(K-1)$$ for the others.
- Use it in cross-entropy training.
Show its effect compared to hard one-hot targets.
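A sketch of the target transformation:

```python
import torch

def smooth_labels(targets, num_classes, eps=0.1):
    # eps/(K-1) everywhere, then 1-eps at the true class index.
    smooth = torch.full((targets.size(0), num_classes), eps / (num_classes - 1))
    smooth.scatter_(1, targets.unsqueeze(1), 1 - eps)
    return smooth

print(smooth_labels(torch.tensor([0, 2]), num_classes=3))
# tensor([[0.9000, 0.0500, 0.0500],
#         [0.0500, 0.0500, 0.9000]])
```

Training against these soft targets can use, e.g., `-(smooth * logits.log_softmax(dim=1)).sum(dim=1).mean()`.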
-
Save and Load TorchScript Model
Convert a trained PyTorch model to TorchScript via tracing and scripting. Save it to disk. Reload and run inference. Compare outputs with the original model.
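A minimal round trip:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2)).eval()
example = torch.randn(1, 10)

traced = torch.jit.trace(model, example)   # tracing: records the ops run on `example`
scripted = torch.jit.script(model)         # scripting: compiles the module's code directly
traced.save("model_traced.pt")

reloaded = torch.jit.load("model_traced.pt")
assert torch.allclose(reloaded(example), model(example))  # outputs match the original
```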
-
Mixed Precision Training with autocast
Modify a training loop to use `torch.cuda.amp.autocast`:
- Wrap forward + loss in `autocast`.
- Use `GradScaler` for backward.
Compare training speed vs. full precision.
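The shape of the modified loop, as a sketch; `model`, `optimizer`, `loss_fn`, and `loader` are assumed to exist, and a CUDA device is required for any speedup:

```python
import torch

scaler = torch.cuda.amp.GradScaler()
for x, y in loader:                          # loader assumed defined elsewhere
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = loss_fn(model(x), y)          # forward + loss in mixed precision
    scaler.scale(loss).backward()            # scale to avoid fp16 gradient underflow
    scaler.step(optimizer)                   # unscales grads, skips step on inf/nan
    scaler.update()
```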