## Tensor Manipulation: Implement Layer Normalization

### Description

Layer Normalization is a key component in many modern deep learning models, especially Transformers. It normalizes the inputs across the feature dimension. Your task is to...
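A minimal sketch of one possible solution, normalizing over the last (feature) dimension; the signature and `eps` default are illustrative, not taken from the task:

```python
import torch

def layer_norm(x, gamma, beta, eps=1e-5):
    # Normalize over the last (feature) dimension, then apply a learnable
    # affine transform. gamma and beta have shape (d_model,).
    mean = x.mean(dim=-1, keepdim=True)
    var = x.var(dim=-1, unbiased=False, keepdim=True)  # biased, as in nn.LayerNorm
    return gamma * (x - mean) / torch.sqrt(var + eps) + beta
```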
---

## Tensor Manipulation: Numerically Stable Softmax

### Description

Implement the softmax function, which converts a vector of numbers into a probability distribution. A naive implementation can be numerically unstable if the input values are very...
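The standard fix is to subtract the per-row maximum before exponentiating, since softmax is invariant to shifts; a hedged sketch:

```python
import torch

def stable_softmax(x, dim=-1):
    # Subtracting the max leaves the result unchanged (shift invariance)
    # but keeps exp() from overflowing for large inputs.
    shifted = x - x.max(dim=dim, keepdim=True).values
    exps = shifted.exp()
    return exps / exps.sum(dim=dim, keepdim=True)
```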
---

## Einops: Reversing a Sequence

### Description

Reversing the order of elements in a sequence is a common operation. While it can be done with `torch.flip` (PyTorch tensors do not support negative-step slicing), let's practice doing it with `einops` for a different...
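The text is cut off before the intended pattern, and notably `einops` has no flip primitive, so a pure-`einops` reverse is not straightforward. For reference, a sketch of the two standard PyTorch ways to reverse an axis (both return copies):

```python
import torch

x = torch.arange(12).reshape(3, 4)   # (batch, seq)

# Baseline 1: torch.flip copies the tensor with the given axes reversed.
rev1 = torch.flip(x, dims=[1])

# Baseline 2: advanced indexing with a descending index (also a copy).
idx = torch.arange(x.shape[1] - 1, -1, -1)
rev2 = x[:, idx]

assert torch.equal(rev1, rev2)
```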
---

## Tensor Manipulation: Dropout Layer

### Description

Implement the dropout layer from scratch. During training, dropout randomly zeroes some of the elements of the input tensor with probability `p`. The remaining elements are scaled...
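A sketch of "inverted" dropout, assuming the common convention of rescaling the surviving activations at train time so evaluation needs no correction:

```python
import torch

def dropout(x, p=0.5, training=True):
    # During evaluation dropout is the identity function.
    if not training or p == 0.0:
        return x
    # Keep each element with probability 1 - p, then rescale by 1/(1 - p)
    # so the expected activation is unchanged.
    mask = (torch.rand_like(x) >= p).to(x.dtype)
    return x * mask / (1.0 - p)
```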
---

## Tensor Manipulation: One-Hot Encoding

### Description

Implement one-hot encoding for a batch of class indices. Given a 1D tensor of integer labels, create a 2D tensor where each row is a vector of zeros except for a `1` at the index...
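A possible solution using advanced indexing (the function name is illustrative):

```python
import torch

def one_hot(labels, num_classes):
    # labels: (N,) integer class indices -> (N, num_classes) one-hot floats.
    out = torch.zeros(labels.shape[0], num_classes)
    out[torch.arange(labels.shape[0]), labels] = 1.0
    return out

print(one_hot(torch.tensor([2, 0, 1]), 3))
# tensor([[0., 0., 1.],
#         [1., 0., 0.],
#         [0., 1., 0.]])
```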
---

## Einops: Batched Matrix Multiplication

### Description

Perform a batched matrix multiplication `(B, N, D) @ (B, D, M) -> (B, N, M)` using `einops.einsum`. While `torch.bmm` is the standard, this is a good exercise to understand how...
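A sketch assuming einops ≥ 0.5, where `einops.einsum` takes the tensors first and the pattern last:

```python
import torch
from einops import einsum

a = torch.randn(4, 8, 16)   # (B, N, D)
b = torch.randn(4, 16, 32)  # (B, D, M)

# Axes appearing in both inputs and the output (b) are kept aligned;
# the axis appearing only in the inputs (d) is summed over.
out = einsum(a, b, 'b n d, b d m -> b n m')
assert out.shape == (4, 8, 32)
```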
---

## Tensor Manipulation: Using `gather` for selection

### Description

`torch.gather` is a powerful but sometimes confusing function for selecting elements from a tensor based on an index tensor. Your task is to use it to select specific elements...
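A small illustration of the semantics, picking one logit per row (the task's actual selection target is in the truncated text):

```python
import torch

logits = torch.randn(4, 10)            # (batch, num_classes)
targets = torch.tensor([3, 7, 0, 9])   # one class index per row

# The index must have the same number of dims as the input; along dim=1,
# output[row, col] = logits[row, index[row, col]].
picked = logits.gather(1, targets.unsqueeze(1)).squeeze(1)  # (4,)
assert torch.equal(picked, logits[torch.arange(4), targets])
```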
---

## Tensor Manipulation: Creating `unfold` with `as_strided`

### Description

**Warning: `as_strided` is an advanced and potentially unsafe operation that can crash your program if used incorrectly, as it creates a view on memory without checks.** With that...
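A hedged sketch of 1D sliding windows, equivalent to `x.unfold(0, window, 1)`. It assumes `x` is contiguous, which is exactly the kind of invariant `as_strided` will not check for you:

```python
import torch

def sliding_windows(x, window):
    # x: contiguous 1D tensor. Returns overlapping views of length `window`
    # with stride 1, without copying memory.
    n = x.shape[0]
    s = x.stride(0)
    return x.as_strided(size=(n - window + 1, window), stride=(s, s))

x = torch.arange(6.)
print(sliding_windows(x, 3))
# tensor([[0., 1., 2.],
#         [1., 2., 3.],
#         [2., 3., 4.],
#         [3., 4., 5.]])
```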
---

## Einops: Repeat for Tiling/Broadcasting

### Description

The `einops.repeat` function is a powerful and readable alternative to `Tensor.expand` or `torch.tile` for broadcasting or repeating a tensor along new or existing dimensions. ...
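Two representative patterns (axis names are arbitrary):

```python
import torch
from einops import repeat

x = torch.randn(32, 32)

# Broadcast along a brand-new trailing axis (grayscale -> 3-channel).
rgb = repeat(x, 'h w -> h w c', c=3)        # (32, 32, 3)

# Repeat along an existing axis: each row appears r times consecutively.
rows = repeat(x, 'h w -> (h r) w', r=2)     # (64, 32)
```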
---

## Tensor Manipulation: Using `scatter_add_`

### Description

`Tensor.scatter_add_` is used to add values into a tensor at specified indices. It's useful in cases like converting an edge list in a graph to an adjacency matrix or pooling...
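A small example that sums values into buckets by index, i.e. a weighted bincount:

```python
import torch

values = torch.tensor([1., 2., 3., 4.])
index = torch.tensor([0, 2, 0, 1])

# out[index[i]] += values[i] for every i; duplicates accumulate.
out = torch.zeros(3)
out.scatter_add_(0, index, values)
print(out)  # tensor([4., 4., 2.])
```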
---

## Implement a Neural Ordinary Differential Equation

### Description

Instead of modeling a function directly, a Neural ODE models its derivative with a neural network. The output is then found by integrating this derivative over time. [1] Your task...
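A minimal fixed-step Euler sketch; real implementations typically use adaptive solvers and the adjoint method (e.g. the torchdiffeq library), so treat this as illustrative only:

```python
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    # The network models dz/dt = f(t, z).
    def __init__(self, dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, t, z):
        # t is a python float; feed it in as an extra input feature.
        t_col = torch.full((z.shape[0], 1), t)
        return self.net(torch.cat([z, t_col], dim=-1))

def odeint_euler(func, z0, t0=0.0, t1=1.0, steps=50):
    # Fixed-step Euler integration; gradients flow through every step.
    z, dt = z0, (t1 - t0) / steps
    for i in range(steps):
        z = z + dt * func(t0 + i * dt, z)
    return z
```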
---

## Model-Agnostic Meta-Learning (MAML) Update Step

### Description

Model-Agnostic Meta-Learning (MAML) is a meta-learning algorithm that trains a model's initial parameters such that it can adapt to a new task with only a few gradient steps. [1]...
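A sketch of the differentiable inner step, assuming PyTorch ≥ 2.0 for `torch.func.functional_call`; the single-step setup and `inner_lr` default are illustrative:

```python
import torch
from torch.func import functional_call

def maml_inner_step(model, loss_fn, x_support, y_support, inner_lr=0.01):
    # One inner-loop SGD step. create_graph=True keeps the update
    # differentiable so the meta-loss can backprop into the initial params.
    params = dict(model.named_parameters())
    preds = functional_call(model, params, (x_support,))
    grads = torch.autograd.grad(loss_fn(preds, y_support),
                                list(params.values()), create_graph=True)
    return {name: p - inner_lr * g
            for (name, p), g in zip(params.items(), grads)}

# Meta-objective: evaluate the adapted parameters on the query set,
# then backprop through the inner step:
#   adapted = maml_inner_step(model, loss_fn, xs, ys)
#   meta_loss = loss_fn(functional_call(model, adapted, (xq,)), yq)
```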
---

## Build a Transformer Encoder Block from Scratch

### Description

The Transformer architecture is built upon a fundamental component: the Encoder block. [1] Each block is responsible for processing a sequence of embeddings and refining them. Your...
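A compact pre-LN sketch; it leans on `nn.MultiheadAttention` for brevity, whereas the exercise presumably wants the attention written out by hand:

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    # Pre-LN variant: LayerNorm -> sublayer -> residual add.
    def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(),
            nn.Dropout(dropout), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.drop = nn.Dropout(dropout)

    def forward(self, x, key_padding_mask=None):
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, key_padding_mask=key_padding_mask)
        x = x + self.drop(attn_out)
        x = x + self.drop(self.ff(self.norm2(x)))
        return x
```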
---

## Differentiable Additive Synthesizer

### Description

Differentiable Digital Signal Processing (DDSP) is a technique that combines classic signal processing with deep learning by making the parameters of synthesizers learnable via...
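A hedged sketch of a harmonic additive synthesizer: instantaneous phase is the cumulative sum of angular frequency, and every op is differentiable so gradients reach the amplitude envelopes. Shapes and the sample rate are assumptions:

```python
import math
import torch

def additive_synth(f0, amplitudes, sample_rate=16000):
    # f0:         (T,) fundamental frequency in Hz per sample
    # amplitudes: (T, K) per-harmonic amplitude envelopes (network outputs)
    T, K = amplitudes.shape
    harmonics = torch.arange(1, K + 1)                            # (K,)
    # Instantaneous phase = cumulative sum of angular frequency.
    phase = 2 * math.pi * torch.cumsum(f0 / sample_rate, dim=0)   # (T,)
    # (T, K): harmonic k oscillates at k * f0.
    waves = torch.sin(phase.unsqueeze(-1) * harmonics)
    return (amplitudes * waves).sum(dim=-1)                       # (T,)
```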
---

## Soft Actor-Critic (SAC) Critic Loss

### Description

Soft Actor-Critic (SAC) is a state-of-the-art reinforcement learning algorithm known for its stability and sample efficiency. [1] A key component is its critic (or Q-network)...
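A sketch of the soft Bellman target with clipped double-Q; tensor shapes and hyperparameter defaults are illustrative:

```python
import torch
import torch.nn.functional as F

def sac_critic_loss(q1, q2, q1_target, q2_target, log_prob_next,
                    reward, done, gamma=0.99, alpha=0.2):
    # q1, q2:               Q(s, a) from the two online critics, shape (B,)
    # q1_target, q2_target: target-network Q(s', a') for a' ~ pi(.|s')
    # log_prob_next:        log pi(a'|s'), shape (B,)
    with torch.no_grad():
        # Clipped double-Q minus the entropy term (SAC's soft Bellman backup).
        min_q_next = torch.min(q1_target, q2_target) - alpha * log_prob_next
        target = reward + gamma * (1.0 - done) * min_q_next
    return F.mse_loss(q1, target) + F.mse_loss(q2, target)
```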
---

## Implement a Knowledge Distillation Loss

### Description

Knowledge Distillation is a model compression technique where a small "student" model is trained to mimic a larger, pre-trained "teacher" model. [1] This is achieved by training...
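A sketch of the classic Hinton et al. formulation, blending a temperature-scaled KL term with ordinary cross-entropy; the `T²` factor keeps gradient magnitudes comparable across temperatures:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    # Soft targets: KL between temperature-softened distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction='batchmean') * temperature ** 2
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```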
---

## Masked Autoencoder (MAE) Input Preprocessing

### Description

Masked Autoencoders (MAE) are a powerful self-supervised learning technique for vision transformers. The core idea is simple: randomly mask a large portion of the input image...
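A sketch of the two preprocessing steps: patchify, then per-sample random masking via argsort of noise, mirroring the approach in the MAE paper (names are illustrative):

```python
import torch

def patchify(imgs, patch=16):
    # (B, C, H, W) -> (B, N, patch*patch*C) non-overlapping patches.
    B, C, H, W = imgs.shape
    x = imgs.unfold(2, patch, patch).unfold(3, patch, patch)  # (B,C,h,w,p,p)
    return x.permute(0, 2, 3, 4, 5, 1).reshape(B, -1, patch * patch * C)

def random_masking(x, mask_ratio=0.75):
    # Shuffle patch indices independently per sample and keep only the
    # first (1 - mask_ratio) fraction; ids_shuffle allows unshuffling later.
    B, N, D = x.shape
    n_keep = int(N * (1 - mask_ratio))
    ids_shuffle = torch.rand(B, N).argsort(dim=1)
    ids_keep = ids_shuffle[:, :n_keep]
    x_kept = x.gather(1, ids_keep.unsqueeze(-1).expand(-1, -1, D))
    return x_kept, ids_shuffle
```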
---

## Neural Cellular Automata (NCA) Update Step

### Description

Neural Cellular Automata (NCA) are a fascinating generative model where complex global patterns emerge from simple, local rules learned by a neural network. [1] A grid of "cells,"...
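A simplified sketch of one update step (identity + Sobel perception filters, a 1×1-conv update network, stochastic firing); it omits the alive-masking of the original work:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NCAUpdate(nn.Module):
    def __init__(self, channels=16, hidden=128):
        super().__init__()
        # Fixed perception filters: identity + Sobel x/y, applied per channel.
        ident = torch.tensor([[0., 0., 0.], [0., 1., 0.], [0., 0., 0.]])
        sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]) / 8
        kernels = torch.stack([ident, sobel_x, sobel_x.T])           # (3,3,3)
        self.register_buffer(
            'kernels', kernels.repeat(channels, 1, 1).unsqueeze(1))  # (3C,1,3,3)
        self.channels = channels
        # Tiny per-cell update network (1x1 convs = a shared MLP over cells).
        self.net = nn.Sequential(
            nn.Conv2d(3 * channels, hidden, 1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 1))

    def forward(self, state, fire_rate=0.5):
        # state: (B, C, H, W). Perceive each cell's 3x3 neighbourhood.
        y = F.conv2d(state, self.kernels, padding=1, groups=self.channels)
        dx = self.net(y)
        # Stochastic update: each cell fires independently.
        mask = (torch.rand_like(state[:, :1]) < fire_rate).float()
        return state + dx * mask
```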
---

## Bayesian Neural Network Layer

### Description

In a standard neural network, weights are single point estimates. In a Bayesian Neural Network (BNN), we learn a probability distribution over each weight. [1] This allows for...
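A sketch of a Bayesian linear layer using the reparameterization trick; the KL/prior term needed for full variational training is omitted:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesianLinear(nn.Module):
    # Each weight has a learned Gaussian posterior N(mu, sigma^2);
    # sigma = softplus(rho) keeps it positive. Sampling w = mu + sigma * eps
    # lets gradients flow to mu and rho.
    def __init__(self, in_features, out_features):
        super().__init__()
        self.w_mu = nn.Parameter(torch.randn(out_features, in_features) * 0.1)
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -5.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -5.0))

    def forward(self, x):
        w_sigma = F.softplus(self.w_rho)
        b_sigma = F.softplus(self.b_rho)
        w = self.w_mu + w_sigma * torch.randn_like(w_sigma)
        b = self.b_mu + b_sigma * torch.randn_like(b_sigma)
        return F.linear(x, w, b)
```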
---

## Deep Canonical Correlation Analysis (DCCA) Loss

### Description

Canonical Correlation Analysis (CCA) is a statistical method for finding correlations between two sets of variables. Deep CCA (DCCA) uses neural networks to first project two...
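A hedged sketch of the correlation objective: whiten the cross-covariance matrix and take the negative trace norm (sum of singular values); the `eps` regularization is an assumption for numerical stability:

```python
import torch

def dcca_loss(h1, h2, eps=1e-4):
    # h1, h2: (N, d) projections from the two networks.
    # Loss = negative sum of canonical correlations (trace norm of T).
    n = h1.shape[0]
    h1 = h1 - h1.mean(dim=0)
    h2 = h2 - h2.mean(dim=0)
    s12 = h1.T @ h2 / (n - 1)
    s11 = h1.T @ h1 / (n - 1) + eps * torch.eye(h1.shape[1], device=h1.device)
    s22 = h2.T @ h2 / (n - 1) + eps * torch.eye(h2.shape[1], device=h2.device)

    def inv_sqrt(s):
        # Matrix inverse square root via eigendecomposition.
        vals, vecs = torch.linalg.eigh(s)
        return vecs @ torch.diag(vals.clamp_min(eps).rsqrt()) @ vecs.T

    t = inv_sqrt(s11) @ s12 @ inv_sqrt(s22)
    return -torch.linalg.svdvals(t).sum()
```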
---

## Siamese Network for One-Shot Image Verification

### Description

Your task is to implement a Siamese network that can determine if two images are of the same class, given only one or a few examples of that class at test time. You'll train a...
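A minimal sketch pairing a shared encoder with a contrastive loss; the architecture and margin are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseNet(nn.Module):
    def __init__(self, embed_dim=128):
        super().__init__()
        # One shared encoder processes both images; weight sharing is the
        # defining property of a Siamese network.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(), nn.LazyLinear(embed_dim))

    def forward(self, x1, x2):
        return self.encoder(x1), self.encoder(x2)

def contrastive_loss(z1, z2, same, margin=1.0):
    # same: 1 if the pair shares a class, else 0. Pulls positives together,
    # pushes negatives apart up to the margin.
    d = F.pairwise_distance(z1, z2)
    return (same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)).mean()
```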
---

## Physics-Informed Neural Network (PINN) for an ODE

### Description

Solve a simple Ordinary Differential Equation (ODE) using a Physics-Informed Neural Network. A PINN is a neural network that is trained to satisfy both the data and the underlying...
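A sketch for the toy problem dy/dt = -y, y(0) = 1, whose exact solution is exp(-t); the specific ODE is an assumption, since the task statement is truncated:

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(),
                    nn.Linear(32, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(5000):
    # Random collocation points on [0, 2].
    t = (torch.rand(128, 1) * 2.0).requires_grad_()
    y = net(t)
    # dy/dt via autograd on the network output.
    dy_dt = torch.autograd.grad(y, t, torch.ones_like(y), create_graph=True)[0]
    residual = (dy_dt + y).pow(2).mean()               # physics loss
    ic = (net(torch.zeros(1, 1)) - 1.0).pow(2).mean()  # initial condition
    loss = residual + ic
    opt.zero_grad()
    loss.backward()
    opt.step()
```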
---

## Graph Convolutional Network for Node Classification

### Description

Implement a simple Graph Convolutional Network (GCN) to perform node classification on a graph dataset like Cora. [1] A GCN layer aggregates information from a node's neighbors to...
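A dense-matrix sketch of the Kipf & Welling layer, H' = D^(-1/2)(A + I)D^(-1/2) H W; a real Cora pipeline would use sparse operations:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x, adj):
        # Symmetric normalization with self-loops added.
        a_hat = adj + torch.eye(adj.shape[0], device=adj.device)
        d_inv_sqrt = a_hat.sum(dim=1).rsqrt()           # degree^{-1/2}
        a_norm = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]
        return a_norm @ self.linear(x)

class GCN(nn.Module):
    def __init__(self, in_dim, hidden, n_classes):
        super().__init__()
        self.l1 = GCNLayer(in_dim, hidden)
        self.l2 = GCNLayer(hidden, n_classes)

    def forward(self, x, adj):
        return self.l2(F.relu(self.l1(x, adj)), adj)
```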
---

## HyperNetwork for Weight Generation

### Description

Implement a simple HyperNetwork. A HyperNetwork is a neural network that generates the weights for another, larger network (the "target network"). [1] This allows for dynamic...
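A sketch where a small MLP emits the weight matrix and bias of a target linear layer, applied with `F.linear` (all sizes are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperLinear(nn.Module):
    # A small hypernetwork maps a conditioning vector z to the parameters
    # of a target linear layer.
    def __init__(self, z_dim, in_features, out_features, hidden=64):
        super().__init__()
        self.in_f, self.out_f = in_features, out_features
        n_params = out_features * in_features + out_features
        self.hyper = nn.Sequential(
            nn.Linear(z_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_params))

    def forward(self, x, z):
        flat = self.hyper(z)                              # (n_params,)
        w = flat[:self.out_f * self.in_f].view(self.out_f, self.in_f)
        b = flat[self.out_f * self.in_f:]
        return F.linear(x, w, b)

layer = HyperLinear(z_dim=8, in_features=16, out_features=4)
y = layer(torch.randn(10, 16), torch.randn(8))            # (10, 4)
```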
---

## Normalizing Flow for Density Estimation

### Description

Implement a simple 2D Normalizing Flow model. Normalizing Flows transform a simple base distribution (like a Gaussian) into a more complex distribution by applying a sequence of...
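A sketch of one RealNVP-style affine coupling layer for 2D data, returning the per-sample log-determinant needed for the change-of-variables log-likelihood:

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    # One coordinate passes through unchanged and parameterizes an affine
    # map of the other, so the inverse and log|det J| are both cheap;
    # alternating `flip` across layers lets every coordinate be transformed.
    def __init__(self, hidden=64, flip=False):
        super().__init__()
        self.flip = flip
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 2))

    def forward(self, x):
        x1, x2 = x[:, :1], x[:, 1:]
        if self.flip:
            x1, x2 = x2, x1
        log_s, t = self.net(x1).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)                  # keep the scale bounded
        y2 = x2 * log_s.exp() + t
        y = torch.cat([x1, y2] if not self.flip else [y2, x1], dim=-1)
        return y, log_s.sum(dim=-1)                # log|det J| per sample

# Density estimation: push x through each coupling layer, accumulate the
# log-dets, and evaluate the base (standard normal) log-prob at the end.
```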