Tensor Manipulation: Dropout Layer
Description
Implement the dropout layer from scratch. During training, dropout randomly zeroes each element of the input tensor with probability p. The surviving elements are scaled up by 1 / (1 - p) so that the expected value of each element is unchanged. For example, with p = 0.5, roughly half the elements are zeroed and the survivors are doubled. Dropout is only active during training; in evaluation mode it is a no-op.
Guidance
- Check if the model is in training mode.
- Create a random binary mask with the same shape as the input tensor, where the probability of a 1 is 1 - p.
- Apply the mask to the input.
- Scale the result.
Starter Code
import torch

def custom_dropout(x, p=0.5, training=True):
    # Dropout is a no-op in eval mode or when nothing is dropped.
    if not training or p == 0:
        return x
    if p == 1:
        # Every element is dropped; avoid dividing by zero below.
        return torch.zeros_like(x)
    # Bernoulli keep-mask: each element survives with probability 1 - p.
    # rand_like draws on the same device as x, so no .to(device) is needed.
    mask = (torch.rand_like(x) > p).to(x.dtype)
    # Apply the mask, then rescale so the expected value matches the input.
    return (x * mask) / (1.0 - p)
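As a quick illustration, the function can be exercised like this (the exact zero pattern varies from run to run, since the mask is random):

x = torch.ones(2, 4)
out = custom_dropout(x, p=0.5, training=True)
print(out)  # each element is either 0.0 or 2.0, chosen at random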
Verification
Create a tensor of ones. After applying your dropout function in training mode, some elements should be zero and the others should be 1 / (1 - p). The mean of the output tensor should be close to 1. In eval mode (training=False), the output should be identical to the input.
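One possible way to script these checks, assuming custom_dropout from the starter code is in scope (the tensor size and tolerance here are arbitrary choices; a large tensor keeps the sample mean close to 1):

x = torch.ones(1000, 1000)
p = 0.5

# Training mode: every element is either 0 or 1 / (1 - p), mean stays near 1.
out = custom_dropout(x, p=p, training=True)
vals = torch.unique(out)
assert all(v.item() in (0.0, 1.0 / (1.0 - p)) for v in vals)
assert abs(out.mean().item() - 1.0) < 0.01

# Eval mode: the input passes through untouched.
out_eval = custom_dropout(x, p=p, training=False)
assert torch.equal(out_eval, x)

print("All checks passed.")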