- Einops Warm-up: Reshaping Tensors for Expert Batching
In Mixture of Experts (MoE) models, we often need to reshape tensors to efficiently process data across multiple 'experts'. Imagine you have a batch of sequences, and for each token in each...
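  Since the exercise statement is truncated above, here is only a minimal sketch of the kind of einops reshape the warm-up points at, assuming a `(batch, seq, hidden)` activation tensor and an even split of tokens across experts; all shapes and names are illustrative, not the exercise's required interface.

  ```python
  import torch
  from einops import rearrange

  # Illustrative shapes (assumptions, not from the exercise text).
  batch, seq_len, hidden = 4, 16, 32
  x = torch.randn(batch, seq_len, hidden)

  # Flatten batch and sequence dims so every token becomes one row,
  # ready to be routed to an expert independently.
  tokens = rearrange(x, "b s h -> (b s) h")  # (64, 32)

  # After routing, per-expert batches are often viewed as
  # (experts, capacity, hidden); here we pretend the rows are already
  # grouped, purely to show the inverse reshape.
  num_experts, capacity = 8, 8
  grouped = rearrange(tokens, "(e c) h -> e c h", e=num_experts, c=capacity)

  print(tokens.shape, grouped.shape)
  # torch.Size([64, 32]) torch.Size([8, 8, 32])
  ```

  The point of the einops patterns is that the grouping is stated explicitly in the string (`(b s)`, `(e c)`), so a mismatched shape fails loudly instead of silently permuting tokens.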
- MoE Gating and Dispatch
A core component of a Mixture of Experts model is the 'gating network', which determines which expert(s) each token should be sent to. This is often a `top-k` selection. Your task is to implement...
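  The task statement is cut off, but a minimal sketch of `top-k` gating could look like the following; the function name `top_k_gating` and its signature are assumptions for illustration, not the exercise's required interface.

  ```python
  import torch
  import torch.nn.functional as F

  def top_k_gating(x: torch.Tensor, w_gate: torch.Tensor, k: int = 2):
      """x: (tokens, hidden); w_gate: (hidden, num_experts).
      Returns each token's chosen expert indices and normalized routing weights."""
      logits = x @ w_gate                                 # (tokens, num_experts)
      top_vals, top_idx = torch.topk(logits, k, dim=-1)   # keep the k best experts
      weights = F.softmax(top_vals, dim=-1)               # renormalize over the chosen k
      return top_idx, weights

  # Illustrative usage with made-up sizes.
  tokens, hidden, num_experts = 64, 32, 8
  x = torch.randn(tokens, hidden)
  w_gate = torch.randn(hidden, num_experts)
  idx, w = top_k_gating(x, w_gate, k=2)

  print(idx.shape, w.shape)
  # torch.Size([64, 2]) torch.Size([64, 2])
  ```

  Softmaxing only over the selected `k` logits (rather than all experts) is one common design choice; full implementations typically add load-balancing losses and capacity limits on top of this.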