ML Katas

Linear Regression via Gradient Descent

medium (<1 hr) optimization linear-regression calculus gradient-descent machine-learning

Linear regression is a foundational supervised learning algorithm. Given a dataset of input features X and corresponding target values y, the goal is to find a linear relationship y=X𝐰+b that best fits the data. The "best fit" is typically defined by minimizing the Mean Squared Error (MSE) loss function.

Your task is to implement linear regression using batch gradient descent to find the optimal weights 𝐰 and bias b.

The MSE loss function for N samples is:

$$L(\mathbf{w}, b) = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - (\mathbf{x}_i^T \mathbf{w} + b) \right)^2$$
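For reference, this loss can be evaluated in a few lines of NumPy. This is a minimal sketch; `mse_loss`, `X`, `y`, `w`, and `b` are placeholder names for the data and parameters, not part of the required interface.

```python
import numpy as np

def mse_loss(X, y, w, b):
    """Mean Squared Error of a linear model y ≈ X @ w + b."""
    y_pred = X @ w + b              # predictions, shape (N, 1)
    return np.mean((y - y_pred) ** 2)
```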

You need to derive the gradients of L with respect to 𝐰 and b:

$$\frac{\partial L}{\partial \mathbf{w}} = \;?\,, \qquad \frac{\partial L}{\partial b} = \;?$$

And then update 𝐰 and b iteratively:

$$\mathbf{w} \leftarrow \mathbf{w} - \eta \frac{\partial L}{\partial \mathbf{w}}, \qquad b \leftarrow b - \eta \frac{\partial L}{\partial b}$$

where η is the learning rate.
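Once you have derived the gradients yourself, one way to check them is against a NumPy sketch of a single batch update. The function name and shapes here are illustrative assumptions; the gradient expressions correspond to the standard derivation for the MSE loss above.

```python
import numpy as np

def gradient_step(X, y, w, b, lr):
    """One batch gradient-descent update for MSE linear regression (sketch)."""
    N = X.shape[0]
    y_pred = X @ w + b                     # predictions, shape (N, 1)
    error = y_pred - y                     # shape (N, 1)
    grad_w = (2.0 / N) * (X.T @ error)     # dL/dw, shape (d, 1)
    grad_b = (2.0 / N) * np.sum(error)     # dL/db, scalar
    return w - lr * grad_w, b - lr * grad_b
```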

Implementation Details: Implement a Python class MyLinearRegression with the following methods (a reference sketch follows this list):

* __init__(self, learning_rate=0.01, n_iterations=1000): Initializes the learning rate and number of iterations.
* fit(self, X, y): Trains the model using gradient descent.
  * Initialize 𝐰 and b randomly or to zeros.
  * Perform n_iterations of gradient descent. In each iteration:
    * Calculate the predictions ŷ.
    * Calculate the loss.
    * Calculate the gradients of L with respect to 𝐰 and b.
    * Update 𝐰 and b.
  * Store the history of loss values.
* predict(self, X): Makes predictions on new data X.
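A minimal sketch of the class is given below. It assumes NumPy arrays as input and chooses zero initialization and per-iteration loss bookkeeping; treat it as one possible starting point rather than the reference solution.

```python
import numpy as np

class MyLinearRegression:
    def __init__(self, learning_rate=0.01, n_iterations=1000):
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations
        self.w = None
        self.b = None
        self.loss_history = []

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float).reshape(-1, 1)
        n_samples, n_features = X.shape

        # Initialize parameters to zeros (random initialization also works).
        self.w = np.zeros((n_features, 1))
        self.b = 0.0
        self.loss_history = []

        for _ in range(self.n_iterations):
            y_pred = X @ self.w + self.b                    # predictions
            error = y_pred - y
            self.loss_history.append(np.mean(error ** 2))   # MSE loss

            grad_w = (2.0 / n_samples) * (X.T @ error)      # dL/dw
            grad_b = (2.0 / n_samples) * np.sum(error)      # dL/db

            self.w -= self.learning_rate * grad_w
            self.b -= self.learning_rate * grad_b
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return X @ self.w + self.b
```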

Verification (a sketch of these steps follows the list):

1. Generate a synthetic dataset: X = 2 * np.random.rand(100, 1), y = 4 + 3 * X + np.random.randn(100, 1).
2. Train your MyLinearRegression model on this dataset.
3. Plot the loss history to ensure it converges.
4. Compare your learned 𝐰 and b values with the true values (3 and 4 in this example) and with the results from sklearn.linear_model.LinearRegression. The parameters should be very close.
5. Plot the regression line learned by your model against the data points.
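A verification script might look like the sketch below. It reuses the MyLinearRegression sketch above, assumes matplotlib and scikit-learn are installed, and the learning rate of 0.05 is just an illustrative choice.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# 1. Synthetic dataset with true slope 3 and intercept 4.
np.random.seed(0)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# 2. Train the custom model.
model = MyLinearRegression(learning_rate=0.05, n_iterations=1000)
model.fit(X, y)

# 3. The loss curve should decrease and flatten out.
plt.plot(model.loss_history)
plt.xlabel("iteration")
plt.ylabel("MSE loss")
plt.show()

# 4. Compare against the true parameters and scikit-learn.
sk = LinearRegression().fit(X, y)
print("my w, b:     ", model.w.ravel(), model.b)
print("sklearn w, b:", sk.coef_.ravel(), sk.intercept_)

# 5. Regression line against the data points.
x_line = np.array([[0.0], [2.0]])
plt.scatter(X, y, s=10)
plt.plot(x_line, model.predict(x_line), color="red")
plt.show()
```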