All Katas - ML Katas

Model Compression with Pruning

this year medium (>1 hr) | training pruning model compression sparsify

Implement **model pruning** to reduce the size and computational cost of a trained model. Start with a simple, over-parameterized model (e.g., a fully-connected network on MNIST). Train it to a...
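
As a rough illustration of the pruning step, here is a minimal sketch using `torch.nn.utils.prune`. It assumes magnitude-based (L1) unstructured pruning at 50% sparsity; the layer sizes and the `sparsity` helper are hypothetical, and the training and fine-tuning that the kata asks for are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

# Hypothetical over-parameterized MLP for 28x28 inputs (MNIST-sized).
# In the kata this model would be trained before pruning; that is skipped here.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)


def sparsity(module: nn.Module) -> float:
    """Fraction of a module's weights that are exactly zero."""
    w = module.weight
    return float((w == 0).sum()) / w.numel()


# Zero the 50% of weights with the smallest magnitude in each Linear layer.
for m in model.modules():
    if isinstance(m, nn.Linear):
        prune.l1_unstructured(m, name="weight", amount=0.5)
        prune.remove(m, "weight")  # fold the mask into the weight tensor

linear_sparsities = [sparsity(m) for m in model.modules() if isinstance(m, nn.Linear)]
print(linear_sparsities)
```

Note that zeroed weights do not shrink the dense tensors by themselves; any size or speed benefit depends on how the sparse model is stored and executed.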

Adversarial Training for Robustness

this year hard (>1 hr) | training cnn adversarial robustness fgsm

Implement **adversarial training** on a simple classification model like a small CNN on MNIST. The goal is to make the model robust to adversarial attacks. You'll need to generate adversarial...
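
One possible shape for this, sketched under assumptions: the entry's `fgsm` tag suggests the Fast Gradient Sign Method, so the example below perturbs each input by `epsilon` in the direction of the sign of the loss gradient and trains on an equal mix of clean and adversarial loss. The CNN layout, the `epsilon` value, the 50/50 weighting, and the random stand-in data are all illustrative choices, not part of the kata.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)


def fgsm_attack(model, x, y, epsilon):
    """Return x shifted by epsilon * sign(dLoss/dx), clamped to [0, 1]."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    (grad,) = torch.autograd.grad(loss, x_adv)
    return (x_adv.detach() + epsilon * grad.sign()).clamp(0.0, 1.0)


def adversarial_train_step(model, optimizer, x, y, epsilon=0.1):
    """One optimizer step on an equal mix of clean and FGSM losses."""
    model.train()
    x_adv = fgsm_attack(model, x, y, epsilon)
    optimizer.zero_grad()
    loss = 0.5 * F.cross_entropy(model(x), y) + 0.5 * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()


# Hypothetical small CNN for 1x28x28 inputs.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(8 * 14 * 14, 10),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Random stand-in batch with pixel values in [0, 1]; replace with MNIST batches.
x = torch.rand(8, 1, 28, 28)
y = torch.randint(0, 10, (8,))

x_adv = fgsm_attack(model, x, y, epsilon=0.1)
step_loss = adversarial_train_step(model, optimizer, x, y, epsilon=0.1)
```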

Generative Adversarial Network (GAN) on MNIST

this year hard (>1 hr) | training adversarial generative mnist gan

Implement and train a simple **Generative Adversarial Network (GAN)**. The network consists of a generator and a discriminator. The generator takes a random noise vector and tries to generate a...

Distributed Data Parallel Training

this year hard (>1 hr) | training distributed parallel multi-gpu speed

Set up a **distributed data parallel training** script using `torch.nn.parallel.DistributedDataParallel` and `torch.distributed`. You'll need to use `torch.multiprocessing.spawn` to launch...

Debug Exploding Gradients

this year hard (>1 hr) | pytorch training gradients debugging

Create a deep feedforward net (20 layers, ReLU). Train it on dummy data. Track gradient norms across layers. Observe if gradients explode. Experiment with:
- Smaller learning rate.
- Gradient...
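
The tracking part of this kata could look roughly like the sketch below. It assumes a width of 64, an MSE loss, and random inputs and targets, none of which come from the kata. It records the gradient norm of each layer's weight matrix after a single backward pass; with PyTorch's default initialization the norms may shrink rather than grow, so a larger initialization or learning rate may be needed to provoke an explosion.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

depth, width = 20, 64  # width is an assumed value

# 20 hidden Linear+ReLU blocks followed by a scalar output layer.
layers = []
for _ in range(depth):
    layers += [nn.Linear(width, width), nn.ReLU()]
model = nn.Sequential(*layers, nn.Linear(width, 1))

# Dummy regression data.
x = torch.randn(32, width)
y = torch.randn(32, 1)

loss = nn.functional.mse_loss(model(x), y)
loss.backward()

# Gradient L2 norm per weight matrix, keyed by parameter name ("0.weight", ...).
grad_norms = {
    name: p.grad.norm().item()
    for name, p in model.named_parameters()
    if name.endswith("weight")
}

for name, norm in grad_norms.items():
    print(f"{name:>10s}  grad norm = {norm:.3e}")
```

Logging these values every few training steps, rather than once, makes it easier to see whether a change such as a smaller learning rate actually stabilizes the norms.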

Distributed DataParallel Basics

this year hard (>1 hr) | pytorch training distributed dataparallel

Simulate training with `torch.nn.DataParallel`:
- Define a simple CNN.
- Run it on 2 GPUs (if available).
- Verify batch is split across devices.

Inspect `model.module` usage.
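
The steps above can be sketched as follows. The `SmallCNN` class and the batch size of 16 are hypothetical. The `print` inside `forward` is one simple way to see the per-replica batch size: with 2 GPUs each replica should report 8 samples, while on a single GPU or on CPU the wrapper forwards the whole batch to the underlying module.

```python
import torch
import torch.nn as nn


class SmallCNN(nn.Module):
    """Hypothetical small CNN for 1x28x28 inputs."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8 * 28 * 28, 10)

    def forward(self, x):
        # Runs once per replica, so this shows how the batch was split.
        print(f"replica on {x.device}: batch size {x.shape[0]}")
        h = torch.relu(self.conv(x))
        return self.fc(h.flatten(1))


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# DataParallel uses all visible GPUs by default; with none, it calls the module directly.
model = nn.DataParallel(SmallCNN().to(device))

x = torch.randn(16, 1, 28, 28, device=device)
out = model(x)  # outputs are gathered back into one tensor

# The original network lives at model.module; its state_dict keys lack the "module." prefix.
wrapped_keys = list(model.state_dict().keys())
plain_keys = list(model.module.state_dict().keys())
print(wrapped_keys[0], "vs", plain_keys[0])
```

Saving `model.module.state_dict()` rather than `model.state_dict()` keeps the checkpoint loadable into an unwrapped `SmallCNN`.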