Matrix diagonalization lets you rewrite a complicated transformation matrix as simple scaling along its eigenvectors, turning a tangled rotation-and-stretch into pure stretching. If you've followed transformations, change of basis, and eigenvectors, this is where everything clicks into a single, elegant formula that is useful in engineering, data analysis, and graphics.
What does it mean to diagonalize a matrix?
Think back to change of basis: the same vector looks different depending on the language you use to describe it. Diagonalization takes that idea further. It says that if you switch into the language of the eigenvectors, a complex transformation becomes nothing more than stretching along each axis.
In the standard basis with i-hat and j-hat, a transformation can mix rotation, shear, and scaling. But if you pick the eigenvectors as your new basis vectors, something beautiful happens. The first eigenvector V1 only stretches by lambda 1. The second eigenvector V2 only stretches by lambda 2. No rotation. No shear. Just scaling.
That's why the matrix in this new language is diagonal. Following the golden rule that the columns of a matrix tell you where the basis vectors land, you get a matrix D with lambda 1 and lambda 2 on the diagonal and zeros everywhere else. This is the purest, simplest version of your original transformation.
What is a diagonal matrix? A square matrix whose only nonzero entries sit on the main diagonal. Its action is pure scaling along each axis, with no rotation or shear.
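Written out for the 2x2 case, the diagonal matrix and its action look like this. The names c_1 and c_2 are an assumption made here for the coordinates of a vector in the eigenvector basis; they do not appear in the original notation.

```latex
D = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix},
\qquad
D \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}
= \begin{pmatrix} \lambda_1 c_1 \\ \lambda_2 c_2 \end{pmatrix}
```

Each coordinate is multiplied by its own eigenvalue and nothing else, which is what "pure scaling" means here.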
How does the diagonalization formula A = PDP⁻¹ work?
The translation between your standard language and the eigenvector language is captured by one equation: A = P · D · P⁻¹. Each piece has a clear job, and the structure echoes the change of basis formula you saw earlier in the course.
- P is the change of basis matrix, and its columns are the eigenvectors of A. It translates from the eigenvector language back into the standard language.
- D is the diagonal matrix holding the eigenvalues on its diagonal, in the same order as the eigenvector columns of P. It performs the pure scaling in the eigenvector language.
- P⁻¹ is the inverse change of basis. It acts first, translating a vector from your standard coordinates into the eigenvector language.
Read from right to left, the formula tells a story: translate, scale, translate back. That's all diagonalization is doing.
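The three steps can be traced on a single vector. The sketch below is a minimal illustration in plain Python, assuming the 2x2 example worked through in the next section (A with columns (3, 0) and (1, 2)). The helper name mat_vec and the test vector x are hypothetical choices made for this illustration.

```python
def mat_vec(m, v):
    """Multiply a 2x2 matrix (given as a list of rows) by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

A = [[3, 1], [0, 2]]       # columns (3, 0) and (1, 2)
P = [[1, 1], [0, -1]]      # columns are the eigenvectors (1, 0) and (1, -1)
D = [[3, 0], [0, 2]]       # eigenvalues on the diagonal
P_inv = [[1, 1], [0, -1]]  # for this particular P, the inverse equals P

x = [2, 5]                   # an arbitrary vector in standard coordinates
coords = mat_vec(P_inv, x)   # 1. translate into eigenvector coordinates
scaled = mat_vec(D, coords)  # 2. scale each coordinate by its eigenvalue
result = mat_vec(P, scaled)  # 3. translate back to standard coordinates

direct = mat_vec(A, x)       # apply A in one step for comparison
print(coords, scaled, result, direct)
# prints [7, -5] [21, -10] [11, 10] [11, 10]
```

The last two vectors agree, which is what A = P · D · P⁻¹ claims for every input vector.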
How do I diagonalize a 2x2 matrix step by step?
Let's use the matrix A from the previous class, with entries 3, 1 in the first row and 0, 2 in the second, so its columns are (3, 0) and (1, 2). Its eigenvalues are lambda 1 = 3 and lambda 2 = 2, so the diagonal matrix D has 3 and 2 on its diagonal.
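As a quick check on those eigenvalues, here is the characteristic equation for this A. One of its off-diagonal entries is zero, so the determinant reduces to the product of the diagonal terms.

```latex
\det(A - \lambda I) = (3 - \lambda)(2 - \lambda) - (1)(0) = 0
\quad\Longrightarrow\quad
\lambda_1 = 3, \qquad \lambda_2 = 2
```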
The matrix P is built from the eigenvectors. The first is (1, 0) and the second is (1, -1), so P has those as its columns. To find P⁻¹, use the 2x2 formula: 1 over the determinant of P, times the adjugate matrix, which you get by swapping the entries on the main diagonal and negating the off-diagonal entries. The determinant here is -1, and in this case P⁻¹ turns out to equal P itself: columns (1, 0) and (1, -1).
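The inverse computation, written out term by term for this P:

```latex
P = \begin{pmatrix} 1 & 1 \\ 0 & -1 \end{pmatrix},
\qquad
\det P = (1)(-1) - (1)(0) = -1,
\qquad
P^{-1} = \frac{1}{-1} \begin{pmatrix} -1 & -1 \\ 0 & 1 \end{pmatrix}
       = \begin{pmatrix} 1 & 1 \\ 0 & -1 \end{pmatrix} = P
```

A matrix equal to its own inverse is a coincidence of this example, not a general property of eigenvector matrices.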
Multiply P · D · P⁻¹ from right to left. The first product, D · P⁻¹, gives a matrix with columns (3, 0) and (3, -2). Multiplying that by P on the left returns columns (3, 0) and (1, 2), which is exactly the original A. The formula checks out.
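The same check can be run mechanically. This is a minimal sketch in plain Python that only handles 2x2 matrices stored as lists of rows. The helper names matmul and inverse_2x2 are hypothetical, and Fraction is used so the division by the determinant stays exact.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(m):
    """Inverse via (1 / det) * adjugate; raises if the matrix is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

P = [[1, 1], [0, -1]]    # eigenvectors (1, 0) and (1, -1) as columns
D = [[3, 0], [0, 2]]     # eigenvalues 3 and 2 on the diagonal

P_inv = inverse_2x2(P)
step = matmul(D, P_inv)  # rows [3, 3] and [0, -2]
A = matmul(P, step)      # rows [3, 1] and [0, 2]

print(P_inv == P)                 # True
print(A == [[3, 1], [0, 2]])      # True
```

Note that the code stores matrices by rows, while the text above lists them by columns.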
Why is matrix diagonalization useful in real applications?
One of the most powerful uses is computing matrix powers. Imagine you need to raise A to the 100th power. Doing that directly means 99 matrix multiplications. With diagonalization, the work collapses.
If A = P · D · P⁻¹, then A² = P · D · P⁻¹ · P · D · P⁻¹. The middle P⁻¹ · P becomes the identity and disappears, leaving P · D² · P⁻¹. The pattern generalizes to any power k:
- A^k = P · D^k · P⁻¹.
- D^k is trivial to compute because you only raise the diagonal entries to the k-th power.
- P and P⁻¹ stay the same no matter how high k goes.
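The power rule above can be sketched in a few lines of plain Python, reusing the example matrix with columns (3, 0) and (1, 2). The function names are hypothetical, and the code assumes 2x2 integer matrices stored as lists of rows, so Python's exact integers avoid any rounding.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power_by_repeated_multiplication(m, k):
    """Baseline: multiply k copies of m together, one at a time."""
    result = [[1, 0], [0, 1]]
    for _ in range(k):
        result = matmul(result, m)
    return result

def power_by_diagonalization(p, eigenvalues, p_inv, k):
    """A^k = P * D^k * P^-1, where D^k only needs scalar powers."""
    d_k = [[eigenvalues[0] ** k, 0], [0, eigenvalues[1] ** k]]
    return matmul(matmul(p, d_k), p_inv)

A = [[3, 1], [0, 2]]
P = [[1, 1], [0, -1]]
P_inv = [[1, 1], [0, -1]]  # equals P for this example

fast = power_by_diagonalization(P, (3, 2), P_inv, 100)
slow = power_by_repeated_multiplication(A, 100)
print(fast == slow)                                   # True
print(power_by_diagonalization(P, (3, 2), P_inv, 2))  # [[9, 5], [0, 4]]
```

Whatever k is, the diagonalized route costs two matrix products plus two scalar powers, while the baseline loop grows with k.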
This trick shows up in population models, Markov chains, differential equations, and anywhere repeated linear operations appear. Try it yourself: take the A from the example and square it using diagonalization.
Why diagonalize a matrix? Because powers, exponentials, and repeated transformations become trivial when the matrix is expressed as pure scaling in an eigenvector basis.
When is a matrix not diagonalizable?
Not every matrix can be diagonalized. The whole process depends on being able to build P, and P needs enough independent eigenvectors to form a valid basis.
The rule is precise: an n×n matrix A is diagonalizable if and only if it has n linearly independent eigenvectors. The ideal scenario is when a matrix has n distinct eigenvalues, because that automatically guarantees n linearly independent eigenvectors. That's the case in the example above.
The trouble starts with repeated eigenvalues. When eigenvalues repeat, the matrix might not have enough independent eigenvectors to fill out P, and in that case diagonalization simply isn't possible. You'd need other tools, like the Jordan form, to handle those situations.
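To see why a repeated eigenvalue might or might not cause trouble, the sketch below counts independent eigenvectors for a given eigenvalue of a 2x2 matrix. It assumes the eigenvalue is already known and uses the fact that the count equals 2 minus the rank of A minus lambda times the identity. The function name and the two test matrices are hypothetical examples, not taken from the course.

```python
def eigenvector_count(m, lam):
    """Number of independent eigenvectors of the 2x2 matrix m for eigenvalue lam."""
    a, b = m[0][0] - lam, m[0][1]
    c, d = m[1][0], m[1][1] - lam
    if a == b == c == d == 0:
        rank = 0          # m - lam*I is the zero matrix
    elif a * d - b * c == 0:
        rank = 1          # singular but nonzero
    else:
        rank = 2          # lam is not actually an eigenvalue of m
    return 2 - rank

shear = [[1, 1], [0, 1]]     # eigenvalue 1, repeated twice
identity = [[1, 0], [0, 1]]  # eigenvalue 1, repeated twice as well

print(eigenvector_count(shear, 1))     # 1
print(eigenvector_count(identity, 1))  # 2
```

The shear matrix has only one independent eigenvector, so no valid P exists and it is not diagonalizable. The identity has two, so it is diagonalizable even though its eigenvalue repeats.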
You started this course thinking of a vector as an arrow and you're ending it seeing transformations as structured systems with a hidden geometric language. That language powers video games, engineering, and data analysis, and it's the foundation for almost every quantitative field you'll touch next. Which application of diagonalization do you want to explore first? Tell us in the comments.