Solving Linear Regression with NumPy

Resumen

Solving a linear regression model comes down to one elegant equation from linear algebra: Aθ = B. Once you have matrices A and B, the only thing left is to find theta, the weight vector that holds your model's knowledge. Here you'll learn how to clear that unknown using NumPy, compare two valid approaches, and visualize the best fit line you just trained.

How do you solve Aθ = B using the inverse matrix?

If A is invertible, you can multiply both sides by its inverse and isolate theta. It's the textbook move from linear algebra fundamentals, and it works perfectly when conditions are met.

First, you confirm A is invertible. A quick check with np.linalg.norm(A) tells you the norm is greater than zero, and A.shape confirms it's a 3x3 square matrix. Both boxes ticked.

Then you compute the inverse and solve:

python A_inv = np.linalg.inv(A) theta_inv = A_inv @ B print(f"Model weights theta: {np.round(theta_inv, 2)}")

Each number in the resulting vector maps to a column of your data. The second value corresponds to square meters, the third to the number of rooms, and the first one is the intercept, tied to the bias column you added earlier.

What does the intercept mean in linear regression? It's the model's starting price for a theoretical house with zero square meters and zero rooms. From there, the model adds or subtracts value based on each feature.

Why use np.linalg.solve instead of computing the inverse?

Calculating the inverse is theoretically clean, but in professional practice it's rarely the first choice. It's computationally expensive and can become numerically unstable with large matrices.

The preferred path is to solve the system directly without inverting A:

python theta_solve = np.linalg.solve(A, B) print(f"Model weights theta with solve: {np.round(theta_solve, 2)}")

In this small case, both methods return identical results. But np.linalg.solve is faster and more robust, and it's what you'll reach for when problems scale up. Knowing the inverse path matters too, because it grounds you in why the math works.

When should I use np.linalg.solve over np.linalg.inv? Use solve whenever you can. It's faster, more numerically stable, and avoids the overhead of building an inverse matrix you don't actually need.

How do you predict prices and plot the regression line?

With theta in hand, predicting prices is a single matrix multiplication. You take your bias matrix X_bias and multiply it by theta:

python theta = theta_solve predictions = X_bias @ theta print(f"Real prices: {y}") print(f"Predicted prices: {np.round(predictions, 2)}")

The model returns four predictions against four real prices:

  • House 1: real 310, predicted 308.
  • House 2: real 390, predicted 407.37.
  • House 3: real 325, predicted 317.28.
  • House 4: real 530, predicted 522.28.

This is the best approximation possible given the data. It's the line of best fit your model could draw.

Plotting the best fit line with Matplotlib

To visualize the result, you build a scatter plot of the real data and overlay the regression line. The trick is generating two new houses to anchor the line and computing their predictions:

python plt.figure(figsize=(8,6)) plt.scatter(X[:,0], y, color='blue', label='Real data')

X_line = np.array([[50],[160]]) X_line_bias = np.c_[np.ones((2,1)), X_line, np.array([[2],[4]])] y_line = X_line_bias @ theta

plt.plot(X_line, y_line, 'r-', label='Linear regression line') plt.xlabel('Square meters') plt.ylabel('Price in thousands') plt.title('Our linear regression model') plt.grid() plt.legend() plt.show()

The line predicts 193.15 for a 50 square meter house with 2 rooms and 547 for a 160 square meter house with 4 rooms. Visually, the red line cuts cleanly through the blue points, confirming the model trained correctly using only linear algebra.

What does the normal equation actually solve?

Your data didn't form a perfect line. In linear algebra terms, the price vector y lived outside the column space of X, which made the system Xθ = y inconsistent and without an exact solution.

Since no theta gives you the real prices exactly, you settled for the best approximation: the orthogonal projection of y onto the column space of X. That projection is the shadow of y on the plane, and it's exactly what your predictions represent.

The normal equation is the tool that made this possible. By identifying components A and B, then solving with either the inverse or np.linalg.solve, you found the optimal weights that draw the best fit line through your data.

Practice: predict the price of a 130m² house with 3 rooms

Use the theta vector you just calculated and answer this: what's the predicted price for a house of 130 square meters with 3 rooms?

Quick hint: build the vector [1, 130, 3], remembering to include the bias at the start, then compute the dot product with theta. Drop your result in the comments.

Your model worked, but what made matrix A solvable in the first place? What happens when it isn't? Next you'll learn how the determinant and rank diagnose the state of your system and flag problems before they break your code.