ML Katas

Gradient Accumulation Example

medium · <30 mins · pytorch · implementation · training · gradients

Simulate large-batch training with gradient accumulation:

  • Train with microbatches of size 4.
  • Accumulate gradients over 8 microbatch steps.
  • Call the optimizer step (and zero the gradients) only after each accumulation cycle.

Verify that the resulting parameter update matches training with a single batch of size 32. Hint: scale each microbatch loss by the number of accumulation steps, so the summed gradients equal the gradient of the full-batch mean loss.
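A minimal sketch of one solution, assuming MSE loss, SGD, and a toy linear model (all illustrative choices, not part of the kata statement):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(32, 10)   # full batch: 32 samples, 10 features
y = torch.randn(32, 1)

def make_model():
    torch.manual_seed(1)  # identical initial weights for both runs
    return nn.Linear(10, 1)

# --- Run A: gradient accumulation (8 microbatches of size 4) ---
model_a = make_model()
opt_a = torch.optim.SGD(model_a.parameters(), lr=0.1)
opt_a.zero_grad()
for i in range(8):
    xb, yb = x[i * 4:(i + 1) * 4], y[i * 4:(i + 1) * 4]
    loss = nn.functional.mse_loss(model_a(xb), yb)
    # Divide by the number of accumulation steps so the summed
    # gradients equal the gradient of the full-batch mean loss.
    (loss / 8).backward()
opt_a.step()

# --- Run B: one full batch of size 32 ---
model_b = make_model()
opt_b = torch.optim.SGD(model_b.parameters(), lr=0.1)
opt_b.zero_grad()
nn.functional.mse_loss(model_b(x), y).backward()
opt_b.step()

# Parameters should match up to floating-point tolerance.
for p_a, p_b in zip(model_a.parameters(), model_b.parameters()):
    assert torch.allclose(p_a, p_b, atol=1e-6)
print("accumulated update matches full-batch update")
```

Note the two runs agree only because each microbatch loss is scaled by 1/8; without that scaling, accumulation computes a sum of per-microbatch mean losses instead of the full-batch mean.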