Contenido del curso

Gram-Schmidt: From Messy Basis to Orthonormal

Resumen

The Gram-Schmidt process is an algorithm that turns any messy set of basis vectors into a perfect coordinate system. You will use vector projection to transform a tilted basis into an orthonormal one, the cleanest foundation for working in linear algebra, physics, signal processing, and machine learning.

What is an orthonormal basis and why does it matter?

A basis is a set of linearly independent vectors that generate a space. Think of them as the building blocks of that space. An orthonormal basis takes this idea one step further: every vector is perpendicular to the others and each one has a norm equal to one.

The standard basis with the unit vectors i and j is the textbook example. i equals (1, 0) and j equals (0, 1). Their dot product is zero, which confirms they are orthogonal, and the norm of each is one, which confirms they are unitary. That combination is what makes them orthonormal.

What is an orthonormal basis? It is a set of vectors that are mutually perpendicular and each has a norm of one. It works like a perfectly straight, standardized X and Y axis system.

Working with an orthonormal basis is ideal because projections and coordinate calculations become trivial. Instead of solving systems of equations, you compute a few dot products and you are done.

How do you apply the Gram-Schmidt process step by step?

Let's straighten out a tilted basis in R2 formed by V1 = (3, 1) and V2 = (2, 2) [01:25]. The goal is to convert it into an orthonormal basis.

Step 1: keep the first vector as your anchor

The first orthogonal vector of the new basis is identical to V1. So you define U1 = V1 = (3, 1). This vector sets the direction you will build everything else around.

Step 2: subtract the projection to get the second orthogonal vector

The second vector U2 comes from removing the part of V2 that overlaps with U1. The formula is:

U2 = V2 minus the projection of V2 onto U1

The projection itself uses the dot product divided by the squared norm, multiplied by U1:

  • Dot product V2 · U1 = (2)(3) + (2)(1) = 8.
  • Norm of U1 squared = 3² + 1² = 10.
  • Projection = (8/10) · (3, 1) = (2.4, 0.8).

Now subtract: U2 = (2, 2) − (2.4, 0.8) = (−0.4, 1.2) [04:30].

To confirm orthogonality, check the dot product: (3)(−0.4) + (1)(1.2) = −1.2 + 1.2 = 0. The vectors are perpendicular.

Step 3: normalize to get unit length

Normalization means dividing each vector by its own norm so the length becomes one [05:45].

  • Norm of U1 = √10, so Q1 = (3/√10, 1/√10).
  • Norm of U2 = √(0.16 + 1.44) = √1.6, so Q2 = (−0.4/√1.6, 1.2/√1.6).

Q1 and Q2 are your new orthonormal basis. Plotted on the plane, they look just like i and j, only rotated. They form a 90 degree angle and each has length one.

Why does an orthonormal basis simplify coordinate calculations?

Here is where the payoff shows up. Both the original basis (V1, V2) and the new one (Q1, Q2) generate the exact same plane. You did not change the universe you are describing, you just found a much cleaner coordinate system for it.

Why use Gram-Schmidt instead of solving a linear system? Because once your basis is orthonormal, finding coordinates only requires dot products instead of solving equations.

Say you want the coordinates of W = (5, 3) in the standard basis. The classic approach sets up a linear combination C1·(1,0) + C2·(0,1) = (5, 3) and solves for C1 and C2. With an orthonormal basis you skip that entirely:

  • C1 = W · i = (5)(1) + (3)(0) = 5.
  • C2 = W · j = (5)(0) + (3)(1) = 3.

The result is (5, 3), exactly as expected. The same shortcut works in any orthonormal basis. If you take W = (2, 2) and want its coordinates in the Q1, Q2 basis, you just compute Q1 · W and Q2 · W.

Where is Gram-Schmidt used in real applications?

Finding a better basis to describe a problem is one of the most powerful techniques in science and engineering. It lets you reduce complex problems to their essence.

  • Physics: describing motion with efficient axes.
  • Signal processing: breaking a sound wave into its fundamental frequencies.
  • Statistics and machine learning: identifying the most important patterns in a dataset.

Try computing the coordinates of W = (2, 2) in the Q1, Q2 basis and share your results in the comments.