How Models Learn From Their Own Errors

Summary

Machine learning models learn by measuring how wrong they are and then nudging themselves in the right direction. That measurement starts with two basic operations on vectors: subtraction and scalar multiplication. Both are simple in code, but they sit at the core of how a recommendation system, a price predictor or any trained model improves over time.

Think of a movie recommender. The system guesses what you'll like, compares its guess to your actual rating and uses that gap to adjust. That gap is the error, and you can compute it in NumPy with a single line.

How do you calculate the error between prediction and reality?

The error is the difference between what really happened and what the model predicted. In vector form, you subtract one from the other element by element.

Suppose you rate two movies: a science fiction film and a comedy. You give them a 5 and a 2. The model, still in training, predicts a 4 and a 4. With NumPy, the calculation looks like this:

```python
import numpy as np

real_ratings = np.array([5, 2])
model_prediction = np.array([4, 4])

error = real_ratings - model_prediction
print(error)  # [ 1 -2]
```

The result [1, -2] tells you two things: the model fell one point short on the sci-fi film and overshot the comedy by two points. That signed difference is exactly what a training algorithm needs to know which way to move.

What is the error vector in machine learning? It's the element-wise difference between the real values and the model's prediction. A positive number means the model underestimated; a negative one means it overshot.
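To make the sign convention concrete, here is a minimal sketch that turns each element of the error vector into a label. The label strings and the `labels` variable are illustrative choices, not part of the lesson's code.

```python
import numpy as np

real_ratings = np.array([5, 2])
model_prediction = np.array([4, 4])
error = real_ratings - model_prediction

# Positive error: the prediction was too low. Negative error: it was too high.
# The label strings below are illustrative.
labels = np.where(
    error > 0,
    "underestimated",
    np.where(error < 0, "overestimated", "exact"),
)
print(labels)  # ['underestimated' 'overestimated']
```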

Why not correct the model with the full error?

If the model adjusted itself by the entire error after every example, training would be unstable: each correction would chase the latest example, and predictions would swing wildly from one example to the next. Instead, the model takes a small step in the right direction, controlled by a value called the learning rate.

A typical small value is 0.1, which means the model corrects only 10% of the error per step. Multiplying the error vector by that scalar gives you the actual adjustment:

```python
learning_rate = 0.1
adjustment = error * learning_rate
print(adjustment)  # [ 0.1 -0.2]
```

This is scalar multiplication: every element of the vector gets multiplied by the same number. Repeated thousands of times, these tiny corrections are how a model learns.
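The effect of repeating the correction can be sketched with a toy loop. This is a simplification under a stated assumption: the predictions themselves are nudged at each step, whereas a real model would adjust its internal parameters. The step count of 50 is arbitrary.

```python
import numpy as np

real_ratings = np.array([5.0, 2.0])
prediction = np.array([4.0, 4.0])  # starting guess from the example above
learning_rate = 0.1

# Toy assumption: we update the predictions directly instead of model weights.
for step in range(50):
    error = real_ratings - prediction        # vector subtraction
    adjustment = learning_rate * error       # scalar multiplication
    prediction = prediction + adjustment     # take one small step

print(np.round(prediction, 2))  # close to the real ratings [5, 2]
```

Each pass shrinks the remaining error to 90% of its previous size, so the predictions approach the real ratings gradually instead of jumping.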

How does matrix addition work in NumPy?

In machine learning you'll mostly run into two kinds of sums: adding two matrices with the same shape, and adding a vector to a full matrix. Both come up constantly when you're combining datasets or applying adjustments.

Imagine you run an e-commerce store. Rows are products, columns are stores, and each cell is the units sold. You have January in one matrix and February in another with the same shape. To get cumulative sales, you just add them.

```python
sales_january = np.array([[150, 200, 180], [120, 90, 100]])
sales_february = np.array([[130, 110, 210], [80, 120, 190]])

total_sales = sales_january + sales_february
print(total_sales)
```

NumPy adds element by element, returning a new matrix with the same shape. Same shape in, same shape out.
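To see what "element by element" means, the sketch below spells out the same sum with an explicit loop and compares it to the `+` operator. The loop and the `total_loop` name are only for illustration; in practice you would use `+` directly.

```python
import numpy as np

sales_january = np.array([[150, 200, 180], [120, 90, 100]])
sales_february = np.array([[130, 110, 210], [80, 120, 190]])

# Explicit version of what `+` does: visit every (row, column) cell once.
total_loop = np.zeros_like(sales_january)
rows, cols = sales_january.shape
for i in range(rows):
    for j in range(cols):
        total_loop[i, j] = sales_january[i, j] + sales_february[i, j]

total_sales = sales_january + sales_february
print(np.array_equal(total_loop, total_sales))  # True
print(total_sales)
# [[280 310 390]
#  [200 210 290]]
```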

What is broadcasting in NumPy?

Things get more interesting when the shapes don't match. Say you want to apply a per-store bonus to your January sales. The bonus is a vector with three values, one per store, and the sales matrix is 2 by 3.

```python
bonus = np.array([10, 15, 5])
print(sales_january.shape)  # (2, 3)
print(bonus.shape)          # (3,)

sales_with_bonus = sales_january + bonus
```

The shapes are different, yet the operation works. NumPy uses broadcasting, a technique that virtually expands the smaller array so its shape matches the larger one. The bonus vector gets repeated across both rows of the matrix, and the sum runs as if both operands were 2 by 3.
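One way to picture the virtual expansion is with `np.broadcast_to`, which returns a read-only view of the vector in the larger shape. This is only a sketch for inspection; the `expanded_bonus` name is illustrative, and you never need to call it yourself for the sum to work.

```python
import numpy as np

sales_january = np.array([[150, 200, 180], [120, 90, 100]])
bonus = np.array([10, 15, 5])

# View the bonus as if it had the matrix's shape (no data is copied).
expanded_bonus = np.broadcast_to(bonus, sales_january.shape)
print(expanded_bonus)
# [[10 15  5]
#  [10 15  5]]

sales_with_bonus = sales_january + bonus
print(sales_with_bonus)
# [[160 215 185]
#  [130 105 105]]
```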

What is broadcasting in NumPy? It's a mechanism that lets you operate on arrays of different shapes by virtually expanding the smaller one. The catch: comparing shapes from right to left, each pair of dimensions must either match or include a 1. A dimension that is missing on the left of the shorter shape counts as 1.

If the bonus vector had four elements instead of three, broadcasting would fail and NumPy would raise a ValueError with a message like "operands could not be broadcast together with shapes (2,3) (4,)". The rule is strict, and worth checking before you trust the result.
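The failure case can be reproduced safely by catching the exception. The four-element `bad_bonus` vector below is a hypothetical value made up for this sketch.

```python
import numpy as np

sales_january = np.array([[150, 200, 180], [120, 90, 100]])
bad_bonus = np.array([10, 15, 5, 20])  # hypothetical: four values for three stores

message = None
try:
    sales_january + bad_bonus
except ValueError as exc:
    # Trailing dimensions are 3 and 4: not equal, and neither is 1.
    message = str(exc)
    print(message)
```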

How can you practice broadcasting without running code?

Here's an exercise to test your intuition. Picture a 4 by 3 matrix that stores the X, Y and Z coordinates of four points. Now define a displacement vector with only two elements, for example [2, -1].

The matrix shape is (4, 3).
The vector shape is (2,).
Compare from right to left: 3 vs 2.

Those dimensions aren't equal, and neither is 1. So broadcasting can't happen and the sum would fail. Working through this kind of shape check by hand, before touching the keyboard, will save you hours of debugging when you start stacking real datasets.
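The hand check above can also be written as a small helper. `can_broadcast` is a hypothetical function written for this sketch, not a NumPy API; it only applies the right-to-left rule to two shape tuples.

```python
def can_broadcast(shape_a, shape_b):
    """Return True if two shapes are compatible under the right-to-left rule."""
    # Walk both shapes from the last dimension backwards.
    for dim_a, dim_b in zip(reversed(shape_a), reversed(shape_b)):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    # Leftover leading dimensions of the longer shape pair with an implicit 1.
    return True


print(can_broadcast((4, 3), (2,)))  # False: 3 vs 2
print(can_broadcast((4, 3), (3,)))  # True: 3 vs 3
print(can_broadcast((2, 3), (1, 3)))  # True: 3 vs 3, then 2 vs 1
```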

Share your reasoning in the comments and tell me whether you would reshape the vector or rethink the operation entirely.

Gabriel Obregón

Student

🧮 Vectors, matrices, and error correction in Machine Learning (NumPy)

🎯 CORE IDEA

In machine learning, models improve by making small, controlled adjustments using simple mathematical operations:

➖ Vector subtraction → compute the error

✖️ Scalar multiplication → control the adjustment

➕ Matrix addition → combine data

🔁 Broadcasting → apply a vector to a matrix

👉 Goal: improve predictions without abrupt changes.

🧠 1. COMPUTING THE ERROR

❓ What is the error?

The error measures the difference between reality and the model's prediction.

📌 Definition:

· Error = actual value − prediction

🔍 Interpreting the sign

· ➕ Positive error → the model fell short

· ➖ Negative error → the model overshot

🧪 Quick example

· Actual: [5, 2]

· Prediction: [4, 4]

🧾 Result:

· Error → [1, -2]

⚙️ 2. ADJUSTING WITH THE LEARNING RATE

🎚️ What is the learning rate?

It is a scalar that controls how much the model corrects itself.

📉 Typical value:

· 0.1 → corrects only 10% of the error

⚖️ Why it matters

· 🚀 Too large → abrupt, unstable changes

· 🐢 Too small → slow but steady learning
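The trade-off can be sketched with a toy experiment. The assumptions here are loud ones: the predictions are updated directly rather than a model's parameters, and the three learning rates, the 20-step budget, and the `final_error` helper are all made up for illustration.

```python
import numpy as np

real = np.array([5.0, 2.0])
start = np.array([4.0, 4.0])


def final_error(learning_rate, steps=20):
    """Largest absolute error left after repeating the small-step update."""
    prediction = start.copy()
    for _ in range(steps):
        prediction += learning_rate * (real - prediction)
    return np.abs(real - prediction).max()


# Hypothetical learning rates: too small, reasonable, too large.
for lr in (0.01, 0.1, 2.5):
    print(lr, final_error(lr))
```

In this toy setup the smallest rate barely reduces the error within 20 steps, 0.1 reduces it substantially, and 2.5 overshoots further on every step, so the error grows instead of shrinking.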

🧮 Conceptual example

· Error: [1, -2]

· Learning rate: 0.1

🔧 Applied adjustment:

· [0.1, -0.2]

🧩 3. MATRIX ADDITION (SAME SHAPE)

✅ When you can add

Two matrices can be added if they have:

✔️ The same number of rows

✔️ The same number of columns

➕ What happens

· The sum is element by element

📦 Mental example:

· January sales ➕ February sales ➡️ Total sales

🔁 4. BROADCASTING (VECTOR + MATRIX)

🧠 What is broadcasting?

It is a NumPy rule that lets you add:

🟦 a large matrix ➕ 🟩 a small vector

as long as their shapes are compatible.

🧪 Typical example

· Matrix: (2, 3) → products × stores

· Vector: (3,) → bonus per store

🔄 The vector is applied automatically to each row.

📐 5. KEY BROADCASTING RULES

🧭 How shapes are compared

· They are compared from right to left

📏 Rule per axis

At least one must hold:

✔️ They are equal

✔️ One of the values is 1

🟢 Valid cases

· (2, 3) + (3,) → ✔️

· (2, 3) + (1, 3) → ✔️

🔴 Invalid cases

· (2, 3) + (4,) → ❌

    ◦ 3 ≠ 4

    ◦ neither value is 1
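The three cases listed above can be checked directly in NumPy. This is a minimal sketch; the arrays of zeros and ones are placeholder data, and only their shapes matter.

```python
import numpy as np

matrix = np.zeros((2, 3))

# Placeholder operands, keyed by their shape as written in the notes.
cases = {
    "(3,)": np.ones(3),
    "(1, 3)": np.ones((1, 3)),
    "(4,)": np.ones(4),
}

results = {}
for label, other in cases.items():
    try:
        results[label] = (matrix + other).shape  # shape of the broadcast sum
    except ValueError:
        results[label] = None                    # shapes are incompatible

print(results)  # {'(3,)': (2, 3), '(1, 3)': (2, 3), '(4,)': None}
```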
