## MoE Aggregator: Combining Expert Outputs
After tokens have been dispatched to and processed by their respective experts, the outputs need to be combined based on the weights from the gating network. This exercise focuses on this aggregation step.
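A minimal sketch of the aggregation step is below. The shapes and names (`num_tokens`, `num_experts`, `d_model`) are illustrative assumptions, not part of the exercise spec.

```python
import torch

num_tokens, num_experts, d_model = 8, 4, 16

# Each expert's output for every token, stacked along dim 1.
expert_outputs = torch.randn(num_tokens, num_experts, d_model)
# Gating weights per token, normalized over the experts.
gate_weights = torch.softmax(torch.randn(num_tokens, num_experts), dim=-1)

# Weighted sum over the expert dimension: (T, E, 1) * (T, E, D) -> (T, D).
combined = (gate_weights.unsqueeze(-1) * expert_outputs).sum(dim=1)
print(combined.shape)  # torch.Size([8, 16])
```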
## Building a Simple Mixture of Experts (MoE) Layer
Now, let's combine the concepts of dispatching and aggregating into a full, albeit simplified, `torch.nn.Module` for a Mixture of Experts layer. This layer will replace a standard feed-forward network.
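One possible shape for this module, as a sketch: a dense MoE in which every expert processes every token and a softmax gate blends the outputs (real MoE layers usually route sparsely). All hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

class SimpleMoE(nn.Module):
    """Dense MoE: each expert sees all tokens; the gate blends the outputs."""

    def __init__(self, d_model: int, d_hidden: int, num_experts: int):
        super().__init__()
        self.gate = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        weights = torch.softmax(self.gate(x), dim=-1)             # (B, S, E)
        outputs = torch.stack([e(x) for e in self.experts], -2)   # (B, S, E, D)
        return (weights.unsqueeze(-1) * outputs).sum(dim=-2)      # (B, S, D)

layer = SimpleMoE(d_model=32, d_hidden=64, num_experts=4)
print(layer(torch.randn(2, 10, 32)).shape)  # torch.Size([2, 10, 32])
```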
## Implement a Neural Ordinary Differential Equation
### Description

Instead of modeling a function directly, a Neural ODE models its derivative with a neural network. The output is then found by integrating this derivative over time. [1] Your task...
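A minimal sketch of the idea, using a fixed-step Euler solver (libraries such as torchdiffeq use adaptive solvers with the adjoint method). The network architecture and step count are assumptions.

```python
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    """Parameterizes dh/dt = f(t, h) with a small MLP."""
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, t: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        return self.net(h)

def euler_integrate(func: ODEFunc, h0: torch.Tensor, t0=0.0, t1=1.0, steps=20):
    """Integrate dh/dt from t0 to t1: h_{n+1} = h_n + dt * f(t_n, h_n)."""
    h, dt = h0, (t1 - t0) / steps
    for i in range(steps):
        t = torch.tensor(t0 + i * dt)
        h = h + dt * func(t, h)
    return h

h1 = euler_integrate(ODEFunc(dim=2), torch.randn(5, 2))
print(h1.shape)  # torch.Size([5, 2])
```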
## Model-Agnostic Meta-Learning (MAML) Update Step
### Description

Model-Agnostic Meta-Learning (MAML) is a meta-learning algorithm that trains a model's initial parameters such that it can adapt to a new task with only a few gradient steps. [1]...
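A sketch of one meta-update on a single task, assuming PyTorch 2.x for `torch.func.functional_call`. The model, loss, and synthetic task data are illustrative; real MAML batches several tasks per meta-step.

```python
import torch
import torch.nn as nn
from torch.func import functional_call

model = nn.Linear(1, 1)
inner_lr = 0.01
meta_opt = torch.optim.SGD(model.parameters(), lr=0.001)
loss_fn = nn.MSELoss()

x_support, y_support = torch.randn(10, 1), torch.randn(10, 1)
x_query, y_query = torch.randn(10, 1), torch.randn(10, 1)

# Inner loop: one adaptation step, keeping the graph so gradients
# flow back to the initial parameters (second-order MAML).
params = dict(model.named_parameters())
support_loss = loss_fn(functional_call(model, params, (x_support,)), y_support)
grads = torch.autograd.grad(support_loss, list(params.values()), create_graph=True)
adapted = {name: p - inner_lr * g
           for (name, p), g in zip(params.items(), grads)}

# Outer loop: evaluate the adapted parameters on the query set and
# update the *initial* parameters through the inner step.
query_loss = loss_fn(functional_call(model, adapted, (x_query,)), y_query)
meta_opt.zero_grad()
query_loss.backward()
meta_opt.step()
```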
## Build a Transformer Encoder Block from Scratch
### Description

The Transformer architecture is built upon a fundamental component: the Encoder block. [1] Each block is responsible for processing a sequence of embeddings and refining them. Your...
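A post-norm encoder block sketch under assumed dimensions; it leans on `nn.MultiheadAttention` for the attention sub-layer while wiring the residuals, norms, and feed-forward network by hand.

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    def __init__(self, d_model=128, n_heads=4, d_ff=512, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout,
                                          batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Self-attention sub-layer with residual connection and LayerNorm.
        attn_out, _ = self.attn(x, x, x, need_weights=False)
        x = self.norm1(x + self.drop(attn_out))
        # Position-wise feed-forward sub-layer, same residual pattern.
        return self.norm2(x + self.drop(self.ff(x)))

block = EncoderBlock()
print(block(torch.randn(2, 16, 128)).shape)  # torch.Size([2, 16, 128])
```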
## Soft Actor-Critic (SAC) Critic Loss
### Description

Soft Actor-Critic (SAC) is a state-of-the-art reinforcement learning algorithm known for its stability and sample efficiency. [1] A key component is its critic (or Q-network)...
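The critic loss combines a clipped double-Q target with an entropy bonus. A sketch follows; the `q*` and `policy` arguments are assumed callables (`q(s, a) -> Q-value`, `policy(s) -> (action, log_prob)`), which is a simplification of a full SAC implementation.

```python
import torch
import torch.nn.functional as F

def sac_critic_loss(q1, q2, q1_target, q2_target, policy,
                    s, a, r, s_next, done, gamma=0.99, alpha=0.2):
    with torch.no_grad():
        a_next, log_prob = policy(s_next)
        # Clipped double-Q: take the min of the two targets to curb
        # overestimation bias.
        q_next = torch.min(q1_target(s_next, a_next),
                           q2_target(s_next, a_next))
        # Entropy-regularized Bellman backup.
        y = r + gamma * (1.0 - done) * (q_next - alpha * log_prob)
    # Both critics regress onto the same target.
    return F.mse_loss(q1(s, a), y) + F.mse_loss(q2(s, a), y)
```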
## Neural Cellular Automata (NCA) Update Step
### Description

Neural Cellular Automata (NCA) are a fascinating generative model where complex global patterns emerge from simple, local rules learned by a neural network. [1] A grid of "cells,"...
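One NCA step, sketched below: fixed Sobel/identity filters let each cell perceive its neighborhood, then a learned 1x1-conv network produces a residual state update. The channel count and filters follow the common "Growing NCA" recipe but are assumptions here; the original also applies a stochastic update mask, omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NCAUpdate(nn.Module):
    def __init__(self, channels=16):
        super().__init__()
        sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]) / 8
        identity = torch.zeros(3, 3); identity[1, 1] = 1.0
        kernels = torch.stack([identity, sobel_x, sobel_x.t()])     # (3, 3, 3)
        self.register_buffer(
            "filters", kernels.repeat(channels, 1, 1).unsqueeze(1)) # (3C, 1, 3, 3)
        self.update = nn.Sequential(
            nn.Conv2d(3 * channels, 128, 1), nn.ReLU(),
            nn.Conv2d(128, channels, 1))
        self.channels = channels

    def forward(self, grid: torch.Tensor) -> torch.Tensor:
        # Depthwise convolution gathers each cell's local neighborhood.
        perception = F.conv2d(grid, self.filters, padding=1, groups=self.channels)
        # Residual update: cells change state based on what they perceive.
        return grid + self.update(perception)

nca = NCAUpdate()
print(nca(torch.randn(1, 16, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])
```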
## Bayesian Neural Network Layer
### Description

In a standard neural network, weights are single point estimates. In a Bayesian Neural Network (BNN), we learn a probability distribution over each weight. [1] This allows for...
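A mean-field Gaussian Bayesian linear layer, sketched with the reparameterization trick; the initialization constants are assumptions, and the KL term needed for a full variational (ELBO) objective is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesianLinear(nn.Module):
    """Learns a mean and a softplus-parameterized std for every weight."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.w_mu = nn.Parameter(torch.randn(out_features, in_features) * 0.1)
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -4.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -4.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # sigma = softplus(rho) keeps the standard deviation positive.
        w_sigma = F.softplus(self.w_rho)
        b_sigma = F.softplus(self.b_rho)
        # Reparameterization: sample weights as mu + sigma * eps.
        w = self.w_mu + w_sigma * torch.randn_like(w_sigma)
        b = self.b_mu + b_sigma * torch.randn_like(b_sigma)
        return F.linear(x, w, b)

layer = BayesianLinear(10, 3)
print(layer(torch.randn(4, 10)).shape)  # a different weight sample each call
```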
## Siamese Network for One-Shot Image Verification
### Description

Your task is to implement a Siamese network that can determine if two images are of the same class, given only one or a few examples of that class at test time. You'll train a...
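The test-time verification step can be as simple as thresholding embedding distance, as in the sketch below. The `encoder` is an assumed embedding network (one appears later in this list), and the threshold value is a hypothetical constant you would tune on validation pairs.

```python
import torch
import torch.nn.functional as F

def verify(encoder, img_a: torch.Tensor, img_b: torch.Tensor, threshold=0.5):
    """Return True where the two images are judged to be the same class."""
    with torch.no_grad():
        e_a, e_b = encoder(img_a), encoder(img_b)
    return F.pairwise_distance(e_a, e_b) < threshold
```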
## Physics-Informed Neural Network (PINN) for an ODE
### Description

Solve a simple Ordinary Differential Equation (ODE) using a Physics-Informed Neural Network. A PINN is a neural network that is trained to satisfy both the data and the underlying differential equation.
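A minimal sketch for the illustrative ODE dy/dt = -y with y(0) = 1, whose exact solution is y(t) = exp(-t); the choice of ODE, architecture, and training schedule are all assumptions. The core PINN trick is computing dy/dt with autograd and penalizing the residual.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    t = torch.rand(64, 1, requires_grad=True)  # collocation points in [0, 1]
    y = net(t)
    # dy/dt via autograd -- the physics-informed part.
    dy_dt = torch.autograd.grad(y, t, torch.ones_like(y), create_graph=True)[0]
    residual_loss = ((dy_dt + y) ** 2).mean()          # enforce dy/dt = -y
    ic_loss = (net(torch.zeros(1, 1)) - 1.0).pow(2).squeeze()  # enforce y(0) = 1
    loss = residual_loss + ic_loss
    opt.zero_grad(); loss.backward(); opt.step()

print(net(torch.tensor([[1.0]])).item())  # should approach exp(-1) ≈ 0.368
```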
## Graph Convolutional Network for Node Classification
### Description

Implement a simple Graph Convolutional Network (GCN) to perform node classification on a graph dataset like Cora. [1] A GCN layer aggregates information from a node's neighbors to...
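A dense-adjacency sketch of one GCN layer implementing the symmetric-normalized propagation rule H' = D^{-1/2}(A + I)D^{-1/2} H W from Kipf & Welling; the toy graph below is an assumption, and real Cora pipelines use sparse matrices.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Add self-loops so each node keeps its own features.
        a_hat = adj + torch.eye(adj.size(0), device=adj.device)
        # Symmetric normalization by node degree.
        d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
        a_norm = d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)
        return a_norm @ self.linear(h)

# Toy usage: 5 nodes, 8 features, a random symmetric adjacency matrix.
adj = (torch.rand(5, 5) > 0.5).float()
adj = ((adj + adj.t()) > 0).float()
adj.fill_diagonal_(0)
layer = GCNLayer(8, 4)
print(layer(torch.randn(5, 8), adj).shape)  # torch.Size([5, 4])
```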
## HyperNetwork for Weight Generation
### Description

Implement a simple HyperNetwork. A HyperNetwork is a neural network that generates the weights for another, larger network (the "target network"). [1] This allows for dynamic...
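A sketch under assumed sizes: a small MLP maps a conditioning embedding `z` to the flattened weight and bias of a target linear layer, which is then applied functionally with `F.linear`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperLinear(nn.Module):
    def __init__(self, embed_dim: int, in_features: int, out_features: int):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        n_params = out_features * in_features + out_features
        self.hyper = nn.Sequential(
            nn.Linear(embed_dim, 64), nn.ReLU(), nn.Linear(64, n_params))

    def forward(self, x: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # z conditions which weights get generated.
        flat = self.hyper(z)
        n_w = self.out_features * self.in_features
        w = flat[:n_w].view(self.out_features, self.in_features)
        b = flat[n_w:]
        # Apply the generated weights to the input.
        return F.linear(x, w, b)

layer = HyperLinear(embed_dim=8, in_features=16, out_features=4)
print(layer(torch.randn(2, 16), torch.randn(8)).shape)  # torch.Size([2, 4])
```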
## Normalizing Flow for Density Estimation
### Description

Implement a simple 2D Normalizing Flow model. Normalizing Flows transform a simple base distribution (like a Gaussian) into a more complex distribution by applying a sequence of invertible transformations.
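A RealNVP-style affine coupling layer, sketched for the 2D case: one coordinate passes through unchanged and parameterizes a scale/shift of the other, so the inverse and log-determinant are cheap. The network sizes and tanh-bounded scale are assumptions.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 2))

    def forward(self, x: torch.Tensor):
        x1, x2 = x[:, :1], x[:, 1:]
        log_s, t = self.net(x1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)      # keep scales bounded for stability
        y2 = x2 * torch.exp(log_s) + t
        # log|det J| of a coupling layer is just the sum of log-scales.
        return torch.cat([x1, y2], dim=1), log_s.sum(dim=1)

    def inverse(self, y: torch.Tensor):
        y1, y2 = y[:, :1], y[:, 1:]
        log_s, t = self.net(y1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)
        return torch.cat([y1, (y2 - t) * torch.exp(-log_s)], dim=1)

# Density estimation via change of variables:
# log p(x) = log N(f(x); 0, I) + log|det df/dx|.
flow = AffineCoupling()
z, log_det = flow(torch.randn(16, 2))
base_log_prob = (-0.5 * z.pow(2) - 0.5 * torch.log(torch.tensor(2 * torch.pi))).sum(1)
log_prob = base_log_prob + log_det
```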
## Spiking Neuron with Leaky Integrate-and-Fire
### Description

Implement a single Leaky Integrate-and-Fire (LIF) neuron, the fundamental building block of many Spiking Neural Networks (SNNs). Unlike traditional neurons, LIF neurons operate on...
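A discrete-time simulation sketch of one LIF neuron: the membrane potential leaks toward rest, integrates the input current, and emits a spike (then resets) on crossing the threshold. All constants are illustrative.

```python
import torch

def lif_simulate(input_current: torch.Tensor, tau=20.0, v_thresh=1.0,
                 v_reset=0.0, dt=1.0):
    """input_current: (timesteps,) tensor of injected current."""
    v = torch.tensor(v_reset)
    spikes, voltages = [], []
    for i_t in input_current:
        # Leaky integration: decay toward rest plus input current.
        v = v + (dt / tau) * (-(v - v_reset) + i_t)
        spike = (v >= v_thresh).float()
        v = v * (1.0 - spike) + v_reset * spike  # reset after a spike
        spikes.append(spike); voltages.append(v)
    return torch.stack(spikes), torch.stack(voltages)

# Constant suprathreshold drive produces a regular spike train.
spikes, volts = lif_simulate(torch.full((100,), 1.5))
print(f"{int(spikes.sum())} spikes in 100 steps")
```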
## Batch Normalization From Scratch
Implement 1D batch normalization manually (without using `nn.BatchNorm1d`). Steps:

1. Compute batch mean and variance.
2. Normalize inputs.
3. Scale and shift with learnable $\gamma, \beta$....
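A training-mode sketch of those three steps; running statistics for evaluation mode are omitted for brevity.

```python
import torch
import torch.nn as nn

class ManualBatchNorm1d(nn.Module):
    def __init__(self, num_features: int, eps: float = 1e-5):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(num_features))  # scale
        self.beta = nn.Parameter(torch.zeros(num_features))  # shift
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # 1. Batch statistics per feature (biased variance, as BatchNorm uses).
        mean = x.mean(dim=0)
        var = x.var(dim=0, unbiased=False)
        # 2. Normalize to zero mean, unit variance.
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        # 3. Learnable scale and shift.
        return self.gamma * x_hat + self.beta

x = torch.randn(32, 8) * 3 + 5
out = ManualBatchNorm1d(8)(x)
print(out.mean(0).abs().max())  # ≈ 0: per-feature means are normalized away
```

A useful sanity check is to compare the output against `nn.BatchNorm1d` in training mode on the same batch.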
## Debug Exploding Gradients
Create a deep feedforward net (20 layers, ReLU). Train it on dummy data. Track gradient norms across layers. Observe if gradients explode. Experiment with:

- Smaller learning rate.
- Gradient clipping.
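A sketch of the gradient-norm tracking loop; width, data, and learning rate are illustrative assumptions.

```python
import torch
import torch.nn as nn

layers = []
for _ in range(20):
    layers += [nn.Linear(64, 64), nn.ReLU()]
model = nn.Sequential(*layers, nn.Linear(64, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(128, 64), torch.randn(128, 1)
loss = nn.functional.mse_loss(model(x), y)
opt.zero_grad()
loss.backward()

# Inspect gradient norms layer by layer; explosions show up as norms
# growing by orders of magnitude toward the early layers.
for name, p in model.named_parameters():
    if p.grad is not None and "weight" in name:
        print(f"{name}: grad norm = {p.grad.norm():.3e}")

# One common remedy: clip the global gradient norm before stepping.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
```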
## Implement a Siamese Network
Implement a Siamese network for MNIST digit similarity:

- Two identical CNNs sharing weights.
- Contrastive loss function.
- Train on pairs of digits (same/different).

Evaluate on test pairs.
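A sketch of the network and loss; the encoder architecture and margin are assumptions. Weight sharing comes for free by routing both inputs through the same module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseNet(nn.Module):
    def __init__(self, embed_dim=64):
        super().__init__()
        # One encoder, used for both branches -> shared weights by construction.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(), nn.Linear(64 * 7 * 7, embed_dim))

    def forward(self, x1, x2):
        return self.encoder(x1), self.encoder(x2)

def contrastive_loss(e1, e2, label, margin=1.0):
    """label = 1 for same-class pairs, 0 for different-class pairs."""
    dist = F.pairwise_distance(e1, e2)
    # Pull same pairs together; push different pairs beyond the margin.
    return (label * dist.pow(2) +
            (1 - label) * F.relu(margin - dist).pow(2)).mean()

net = SiameseNet()
e1, e2 = net(torch.randn(8, 1, 28, 28), torch.randn(8, 1, 28, 28))
loss = contrastive_loss(e1, e2, torch.randint(0, 2, (8,)).float())
```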
## Create a Transformer Encoder Block
Implement a single Transformer encoder block:

- Multi-head self-attention.
- Layer normalization.
- Feedforward network.

Compare output with `nn.TransformerEncoderLayer`.
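For the from-scratch block itself, see the `EncoderBlock` sketch earlier in this list. The comparison step might look like the snippet below; note the outputs will not match numerically unless you copy the built-in layer's weights into your block, so matching shapes and output magnitudes is the first check.

```python
import torch
import torch.nn as nn

# Reference block with the same assumed hyperparameters as EncoderBlock.
ref = nn.TransformerEncoderLayer(d_model=128, nhead=4, dim_feedforward=512,
                                 batch_first=True)
ref.eval()  # disable dropout for a deterministic reference output

x = torch.randn(2, 16, 128)
with torch.no_grad():
    out = ref(x)
print(out.shape)  # torch.Size([2, 16, 128]) -- compare against your block
```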
## Distributed DataParallel Basics
Simulate training with `torch.nn.DataParallel`:

- Define a simple CNN.
- Run it on 2 GPUs (if available).
- Verify batch is split across devices.

Inspect `model.module` usage.
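A minimal sketch: `nn.DataParallel` replicates the module on each listed GPU and splits the batch along dim 0, so a print inside `forward` reveals the split. The CNN is illustrative, and the code falls back to a single device when fewer than 2 GPUs are available. (For real multi-GPU training, `DistributedDataParallel` is the recommended API.)

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, 3, padding=1)
        self.head = nn.Linear(8 * 28 * 28, 10)

    def forward(self, x):
        # Each replica prints its own device and sub-batch size.
        print(f"forward on {x.device} with batch size {x.size(0)}")
        return self.head(self.conv(x).flatten(1))

model = TinyCNN()
if torch.cuda.device_count() >= 2:
    model = nn.DataParallel(model, device_ids=[0, 1]).cuda()
    # The original, unwrapped module lives at model.module
    # (useful for saving state_dicts without the "module." prefix).
    print(type(model.module).__name__)  # TinyCNN

device = next(model.parameters()).device
out = model(torch.randn(16, 1, 28, 28).to(device))
print(out.shape)  # torch.Size([16, 10])
```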