-
Tracing Gradient Descent on a Parabola
Imagine a simple 1D function $f(x) = x^2 - 4x + 5$. Your goal is to find the minimum of this function using Gradient Descent. 1. **Derive the gradient**: What is $\frac{df}{dx}$? 2. **Perform a...
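A minimal sketch of how such a trace could look in Python. The derivative $\frac{df}{dx} = 2x - 4$ follows directly from the function above. The starting point `x0`, the `learning_rate`, and the number of steps are assumptions chosen for illustration, since the exercise text is cut off before it states them.

```python
def f(x):
    """The parabola from the exercise: f(x) = x^2 - 4x + 5."""
    return x**2 - 4*x + 5


def grad_f(x):
    """Analytic derivative of f: df/dx = 2x - 4."""
    return 2*x - 4


def gradient_descent(x0, learning_rate, num_steps):
    """Run plain gradient descent and return every iterate, x0 included."""
    x = x0
    history = [x]
    for _ in range(num_steps):
        # Step against the gradient, scaled by the learning rate.
        x = x - learning_rate * grad_f(x)
        history.append(x)
    return history


# Assumed values, not taken from the exercise.
history = gradient_descent(x0=0.0, learning_rate=0.1, num_steps=50)
x_final = history[-1]
print(f"x = {x_final:.5f}, f(x) = {f(x_final):.5f}")
```

Setting the derivative to zero gives the exact minimizer $x = 2$ with $f(2) = 1$, so the iterates can be checked against that value.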
-
Riding the Momentum Wave in Optimization
Stochastic Gradient Descent (SGD) with momentum is a popular optimization algorithm that often converges faster and more stably than plain SGD. 1. **Update Rule**: The update rule for SGD with...
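As a rough illustration, here is one common formulation of the momentum update (a velocity that accumulates past gradients), sketched in Python. It is not necessarily the rule the exercise goes on to state, since that text is cut off. The test function, the momentum coefficient `beta`, the `learning_rate`, and the step count are all assumptions, and the gradient here is exact rather than a stochastic mini-batch estimate.

```python
def grad_f(x):
    """Gradient of an assumed test function f(x) = x^2 - 4x + 5."""
    return 2*x - 4


def momentum_descent(x0, learning_rate, beta, num_steps):
    """Gradient descent with a classical momentum (velocity) term."""
    x = x0
    v = 0.0  # velocity starts at zero
    for _ in range(num_steps):
        # Velocity is a decaying sum of past gradient steps.
        v = beta * v - learning_rate * grad_f(x)
        # The parameter moves by the velocity, not by the raw gradient.
        x = x + v
    return x


# Assumed hyperparameters, chosen only for illustration.
x_final = momentum_descent(x0=0.0, learning_rate=0.1, beta=0.9, num_steps=300)
print(f"x = {x_final:.5f}")
```

With `beta = 0` the velocity reduces to a single gradient step, which recovers plain gradient descent.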