### Einops Warm-up: Reshaping Tensors for Expert Batching

In Mixture of Experts (MoE) models, we often need to reshape tensors to efficiently process data across multiple 'experts'. Imagine you have a batch of sequences, and for each token in each...
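Below is a minimal warm-up sketch of the kind of reshape involved, assuming hidden states of shape `(batch, seq, dim)` and a per-token expert assignment produced by some router; the shapes, the number of experts, and the random assignment are illustrative, not part of the original exercise.

```python
# Sketch: regrouping tokens so each expert sees one contiguous batch.
# Shapes and the random expert assignment below are illustrative assumptions.
import torch
from einops import rearrange

batch, seq, dim, num_experts = 2, 8, 16, 4
x = torch.randn(batch, seq, dim)                 # hidden states: (batch, seq, dim)

# Flatten batch and sequence into a single token axis: (batch*seq, dim)
tokens = rearrange(x, "b s d -> (b s) d")

# Hypothetical per-token expert assignment from some router
expert_idx = torch.randint(0, num_experts, (batch * seq,))

# Sort tokens by expert so each expert's tokens form one contiguous block
order = torch.argsort(expert_idx)
tokens_by_expert = tokens[order]                 # (batch*seq, dim), grouped by expert

# Undo the permutation and restore the (batch, seq, dim) layout
restored = rearrange(tokens_by_expert[torch.argsort(order)], "(b s) d -> b s d", b=batch)
assert torch.equal(restored, x)
```

Sorting by expert index is one simple way to make each expert's tokens contiguous; production MoE implementations often use capacity-limited dispatch instead.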
### Sparse MoE Top-K Gating

In a Mixture of Experts (MoE) model, the gating network is a crucial component that determines which 'expert' subnetworks process each token. [1] A common strategy is **top-k gating**, where each token is routed only to its k highest-scoring experts.
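A minimal sketch of one way to implement top-k gating, assuming `k = 2`, a single linear gate layer, and softmax renormalization over the selected experts; these specific choices are assumptions, not prescribed by the exercise.

```python
# Sketch of top-k gating: route each token to its k highest-scoring experts.
# The gate architecture and k=2 are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKGate(nn.Module):
    def __init__(self, dim: int, num_experts: int, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts)

    def forward(self, x: torch.Tensor):
        # x: (num_tokens, dim)
        logits = self.gate(x)                              # (num_tokens, num_experts)
        topk_vals, topk_idx = logits.topk(self.k, dim=-1)  # keep the k best experts
        weights = F.softmax(topk_vals, dim=-1)             # renormalize over the chosen k
        # Scatter the weights back into a mostly-zero routing matrix
        routing = torch.zeros_like(logits)
        routing.scatter_(-1, topk_idx, weights)
        return routing, topk_idx

gate = TopKGate(dim=16, num_experts=4, k=2)
routing, chosen = gate(torch.randn(10, 16))
print(routing.shape, chosen.shape)  # torch.Size([10, 4]) torch.Size([10, 2])
```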
### Hierarchical Patch Merging with Einops

In hierarchical vision transformers like the Swin Transformer, **patch merging** is used to downsample the feature map, reducing the number of tokens while increasing the channel dimension.
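A sketch of 2x2 patch merging with `einops.rearrange`, assuming channels-last input of shape `(batch, H, W, C)`; the `LayerNorm` followed by a `4C -> 2C` linear reduction mirrors the Swin Transformer design, but the concrete sizes below are illustrative.

```python
# Sketch of 2x2 patch merging: four neighboring patches are concatenated along
# the channel axis, then linearly projected from 4C down to 2C channels.
import torch
import torch.nn as nn
from einops import rearrange

class PatchMerging(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(4 * dim)
        self.reduction = nn.Linear(4 * dim, 2 * dim, bias=False)

    def forward(self, x: torch.Tensor):
        # x: (batch, H, W, C) with H and W divisible by 2
        x = rearrange(x, "b (h p1) (w p2) c -> b h w (p1 p2 c)", p1=2, p2=2)
        return self.reduction(self.norm(x))      # (batch, H/2, W/2, 2C)

merge = PatchMerging(dim=96)
out = merge(torch.randn(1, 56, 56, 96))
print(out.shape)  # torch.Size([1, 28, 28, 192])
```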
### Implement a Linear Regression Model

Build a simple linear regression model using `nn.Module`.

Requirements:

- One input feature, one output.
- Train it on synthetic data $$y = 3x + 2 + \epsilon$$.
- Use `MSELoss` and `SGD`.
- Check that the learned weight and bias approach 3 and 2.
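One possible solution sketch; the data range, noise scale, learning rate, and epoch count below are arbitrary choices, not part of the exercise statement.

```python
# Sketch: linear regression with nn.Module on y = 3x + 2 + noise.
# Hyperparameters (100 points, lr=0.1, 300 epochs) are illustrative choices.
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.rand(100, 1) * 2 - 1                     # 100 inputs in [-1, 1)
y = 3 * x + 2 + 0.1 * torch.randn(100, 1)          # y = 3x + 2 + eps

class LinearRegression(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(1, 1)              # one input feature, one output

    def forward(self, x):
        return self.linear(x)

model = LinearRegression()
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(300):                           # full-batch gradient descent
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

w, b = model.linear.weight.item(), model.linear.bias.item()
print(f"learned w = {w:.2f}, b = {b:.2f}")         # should be close to w = 3, b = 2
```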
### Checkpointing with torch.save

Train a simple feedforward model for 1 epoch. Save:

1. Model state dict.
2. Optimizer state dict.
3. Epoch number.

Then load the checkpoint and resume training seamlessly.
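A sketch of the save/load cycle, assuming a small `nn.Sequential` model on random data; the model, data, and file name are placeholders.

```python
# Sketch: save model/optimizer/epoch after one epoch, then reload and resume.
# The model, data, and "checkpoint.pt" file name are illustrative placeholders.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()
x, y = torch.randn(64, 10), torch.randn(64, 1)

# --- Train for 1 epoch and save a checkpoint ---
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()

torch.save({
    "epoch": 1,
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
}, "checkpoint.pt")

# --- Later: rebuild the same objects, load the checkpoint, and resume ---
model2 = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer2 = torch.optim.SGD(model2.parameters(), lr=0.01)

ckpt = torch.load("checkpoint.pt")
model2.load_state_dict(ckpt["model_state_dict"])
optimizer2.load_state_dict(ckpt["optimizer_state_dict"])
start_epoch = ckpt["epoch"]

for epoch in range(start_epoch, start_epoch + 2):  # continue from the saved epoch
    optimizer2.zero_grad()
    loss = criterion(model2(x), y)
    loss.backward()
    optimizer2.step()
```

Restoring the optimizer state matters for optimizers with internal buffers (e.g. momentum or Adam moments); recreating the optimizer from scratch would silently reset that state.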