
Diagonalization

⚗ Dr. Möbius, from the lab

Last lesson you found the directions a matrix merely stretches. Now we cash that goddamn check. If you build your coordinate grid out of eigenvectors, the matrix sheds every costume and stands there naked as a diagonal matrix: pure, axis-aligned scaling, no bullshit, no shear. Change of basis has been waiting this whole course to make exactly this move: $A = PDP^{-1}$.

THE BIG IDEA

A is diagonalizable when it has a full basis of eigenvectors; then A = PDP⁻¹ with eigenvectors in the columns of P and eigenvalues on the diagonal of D, which makes powers Aᵏ = PDᵏP⁻¹ trivial.

The dream basis

From Change of Basis you know that the matrix of a map in a new basis $\mathcal{B}$ is $P^{-1}AP$, where $P$ has the new basis vectors as columns. And we ended that lesson with a craving: find the basis that makes the matrix diagonal, because diagonal means pure scaling (no mixing, no shear, just "stretch axis one by this, axis two by that").

Now we know which basis to pick: a basis of eigenvectors. Watch why it's forced, because this is the moment the whole stratum clicks. Suppose $\{v_1, \dots, v_n\}$ are eigenvectors forming a basis, with $A v_i = \lambda_i v_i$. Put them in the columns of $P$. Then

$$AP = A\begin{pmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{pmatrix} = \begin{pmatrix} | & & | \\ \lambda_1 v_1 & \cdots & \lambda_n v_n \\ | & & | \end{pmatrix} = P D,$$

where $D = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$ is the diagonal matrix of eigenvalues (the $i$-th column of $PD$ is $\lambda_i v_i$; work it out, it's just "scale each column"). So $AP = PD$, and since the eigenvectors are a basis, $P$ is invertible, giving the two faces of the same fact:

$$A = P D P^{-1}, \qquad D = P^{-1} A P.$$

We say $A$ is diagonalizable. The columns of $P$ are eigenvectors; the diagonal of $D$ holds their eigenvalues, in the matching order: column $i$ of $P$ pairs with entry $i$ of $D$. Get that pairing wrong and the whole thing breaks.
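Here is a minimal Python sketch of that pairing rule. It assumes the 2×2 matrix with rows (4, 1) and (2, 3), whose eigenpairs are 5 with (1, 1) and 2 with (1, −2); you can confirm those by multiplying out. The helper name `matmul` is my own, not something from the lesson. The sketch checks AP = PD once with the diagonal in matching order and once with it swapped.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[4, 1], [2, 3]]
P = [[1, 1], [1, -2]]          # columns: v1 = (1, 1), v2 = (1, -2)
D_matched = [[5, 0], [0, 2]]   # 5 pairs with v1, 2 pairs with v2
D_swapped = [[2, 0], [0, 5]]   # same eigenvalues, wrong order

matched_ok = matmul(A, P) == matmul(P, D_matched)
swapped_ok = matmul(A, P) == matmul(P, D_swapped)
print(matched_ok, swapped_ok)  # True False
```

Same eigenvectors, same eigenvalues, and the identity still fails once the order is off.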

Why you'd kill for diagonal: powers

Here's the payoff that makes diagonalization more than a parlor trick, and it's one of the most satisfying things in this entire course. Compute $A^2$:

$$A^2 = (PDP^{-1})(PDP^{-1}) = PD\underbrace{(P^{-1}P)}_{I}DP^{-1} = P D^2 P^{-1}.$$

The inner $P^{-1}P$ collapses to the identity. The same telescoping happens for any power:

$$A^k = P D^k P^{-1}.$$

And $D^k$ is free, since you just raise each diagonal entry to the $k$-th power: $\operatorname{diag}(\lambda_1, \dots, \lambda_n)^k = \operatorname{diag}(\lambda_1^k, \dots, \lambda_n^k)$. No matrix multiplication at all. This should feel like cheating. This is how you compute $A^{100}$ without a hundred multiplications, how population models project decades forward, and how the Fibonacci numbers get a closed form: the Fibonacci recurrence is $\begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} F_n \\ F_{n-1} \end{pmatrix}$, and diagonalizing that matrix turns "add the last two numbers a million times" into raising two eigenvalues to a power. (Those eigenvalues are $\frac{1 \pm \sqrt 5}{2}$; the golden ratio falls out of a $2\times 2$.) Drag the eigendirections and watch the stretch factors that powers will amplify:

[Interactive widget: eigen lab, hunt the special directions. Plots v and Av for A = [4 1 | 2 3], λ = 5, 2.]
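To make the Fibonacci claim concrete, here is a small Python sketch comparing the add-the-last-two loop against the eigenvalue closed form. The closed form F_n = (φⁿ − ψⁿ)/√5, with φ and ψ the two eigenvalues and the convention F_0 = 0, F_1 = 1, is the standard result of that diagonalization; the lesson does not derive it, so treat it as an imported fact. The function names are mine. Floating-point rounding limits the closed form to moderate n, so the check only goes up to n = 40.

```python
import math

def fib_iterative(n):
    """F_0 = 0, F_1 = 1, built by adding the last two numbers n times."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_eigen(n):
    """Closed form from the two eigenvalues of [[1, 1], [1, 0]]."""
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2   # golden ratio, the larger eigenvalue
    psi = (1 - sqrt5) / 2   # the smaller eigenvalue
    return round((phi ** n - psi ** n) / sqrt5)

agree = all(fib_iterative(n) == fib_eigen(n) for n in range(41))
print(fib_eigen(10), agree)  # 55 True
```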

When does it work?

Diagonalization needs $n$ linearly independent eigenvectors, a full basis of them. So:

  • Distinct eigenvalues guarantee it. If an $n\times n$ matrix has $n$ different eigenvalues, the eigenvectors are automatically independent (we proved this last lesson), hence a basis. Diagonalizable, no further checking.
  • Repeated eigenvalues are a maybe. A repeated eigenvalue might still supply enough independent eigenvectors (the identity matrix has eigenvalue $1$ repeated and is already diagonal), or it might not.

When it doesn't, the matrix is defective: there simply aren't enough eigenvectors to fill a basis, so no eigenbasis exists and $A$ cannot be diagonalized. The Federation of Boring Textbook Authors sweeps this under the rug; I'm not going to. The classic culprit is a shear:

$$S = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

Its only eigenvalue is $1$ (repeated), and solving $(S - I)v = 0$ gives $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}v = 0$, forcing the second component to zero: the only eigenvectors are multiples of $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$. One eigendirection, not two. The shear genuinely tilts the plane in a way no single stretch can undo; it has no eigenbasis, full stop. Defective matrices are real, and pretending otherwise is how people produce nonsense.
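There is also a quick argument by contradiction that no choice of $P$ can rescue the shear. Suppose $S = PDP^{-1}$ for some invertible $P$. The diagonal of $D$ holds eigenvalues of $S$, and the only one available is $1$, so $D$ would have to be the identity. Then

$$S = P D P^{-1} = P I P^{-1} = P P^{-1} = I,$$

which is false, because $S \neq I$. So no such $P$ exists.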

The full 2×2 workflow

To diagonalize a $2\times 2$ matrix $A$:

  1. Eigenvalues. Solve $\det(A - \lambda I) = 0$, i.e. $\lambda^2 - (\operatorname{trace})\lambda + \det = 0$.
  2. Eigenvectors. For each $\lambda$, solve $(A - \lambda I)v = 0$.
  3. Independence check. Distinct $\lambda$'s ⟹ automatically independent. Repeated $\lambda$ ⟹ verify you actually got two independent eigenvectors; if not, it's defective, so stop.
  4. Assemble. $P =$ eigenvectors as columns, $D =$ eigenvalues on the diagonal in the matching order.
  5. Verify $AP = PD$ (cheaper than computing $P^{-1}$, and catches order mistakes instantly).

For a 3×3, the structure is identical, just bigger: a degree-3 characteristic polynomial gives up to three eigenvalues, and you need three independent eigenvectors total across all the eigenspaces to fill $P$. Same workflow, more arithmetic.
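The 2×2 workflow above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a robust routine. It works over the real numbers only, so a matrix with complex eigenvalues is reported as having no real eigenbasis. It uses floating point with an absolute tolerance `tol`. The function name `diagonalize_2x2` is made up for this example.

```python
import math

def diagonalize_2x2(A, tol=1e-12):
    """Return (P, D) with real entries, or None if no real eigenbasis exists."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det      # discriminant of the characteristic polynomial
    if disc < -tol:
        return None                     # complex eigenvalues: no real eigenvectors
    if abs(disc) <= tol:
        # Repeated eigenvalue: two independent eigenvectors only if A is a multiple of I.
        if abs(b) <= tol and abs(c) <= tol:
            return [[1.0, 0.0], [0.0, 1.0]], [[a, 0.0], [0.0, d]]
        return None                     # defective
    root = math.sqrt(disc)
    lams = [(trace + root) / 2, (trace - root) / 2]
    cols = []
    for lam in lams:
        # Pick one nonzero solution of (A - lam*I) v = 0.
        if abs(b) > tol:
            cols.append((b, lam - a))
        elif abs(c) > tol:
            cols.append((lam - d, c))
        else:                           # A is diagonal with a != d
            cols.append((1.0, 0.0) if abs(lam - a) <= tol else (0.0, 1.0))
    P = [[cols[0][0], cols[1][0]], [cols[0][1], cols[1][1]]]
    D = [[lams[0], 0.0], [0.0, lams[1]]]
    return P, D

P, D = diagonalize_2x2([[4, 1], [2, 3]])
shear_result = diagonalize_2x2([[1, 1], [0, 1]])   # None: the shear is defective
print(P, D, shear_result)
```

The eigenvector picks come from reading off a row of $A - \lambda I$: if $b \neq 0$, then $(b, \lambda - a)$ satisfies the first row by construction and the second row because $\lambda$ is a root of the characteristic polynomial.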

The geometric punchline

Strip away the algebra and here's what diagonalization says: in eigen-coordinates, every diagonalizable map is just axis-aligned stretching. The complicated entries of $A$ in the standard basis are an illusion of a badly-chosen grid, a lie the standard basis tells about the map. Rotate your head into the eigenbasis and the map becomes the simplest fucking thing imaginable: multiply each axis by a number. Every diagonalizable transformation, no matter how scrambled it looks, is secretly that simple. That realization is the entire reward of the matrices and spaces strata, and the spectral theorem (next, and last) makes it perfect for symmetric matrices. Go do the gauntlet.

🔬 SPECIMENS (worked examples)

Worked example 1 — diagonalize a 2×2, start to finish

Diagonalize $A = \begin{pmatrix} 4 & 1 \\ 2 & 3 \end{pmatrix}$: find $P$ and $D$ with $A = PDP^{-1}$, and verify.

Step 1: eigenvalues. $\operatorname{trace} = 7$, $\det = 10$, so $\lambda^2 - 7\lambda + 10 = (\lambda - 2)(\lambda - 5) = 0$: eigenvalues $\lambda = 5$ and $\lambda = 2$.

Step 2: eigenvectors. For $\lambda = 5$: $A - 5I = \begin{pmatrix} -1 & 1 \\ 2 & -2 \end{pmatrix}$, giving $y = x$, so $v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$. For $\lambda = 2$: $A - 2I = \begin{pmatrix} 2 & 1 \\ 2 & 1 \end{pmatrix}$, giving $y = -2x$, so $v_2 = \begin{pmatrix} 1 \\ -2 \end{pmatrix}$.

Step 3: independence. Distinct eigenvalues, so $v_1, v_2$ are automatically independent, hence a basis. Diagonalizable.

Step 4: assemble, keeping the order matched ($v_1 \leftrightarrow 5$, $v_2 \leftrightarrow 2$):

$$P = \begin{pmatrix} 1 & 1 \\ 1 & -2 \end{pmatrix}, \qquad D = \begin{pmatrix} 5 & 0 \\ 0 & 2 \end{pmatrix}.$$

Step 5: verify $AP = PD$.

$$AP = \begin{pmatrix} 4 & 1 \\ 2 & 3 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & -2 \end{pmatrix} = \begin{pmatrix} 5 & 2 \\ 5 & -4 \end{pmatrix}, \qquad PD = \begin{pmatrix} 1 & 1 \\ 1 & -2 \end{pmatrix}\begin{pmatrix} 5 & 0 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 5 & 2 \\ 5 & -4 \end{pmatrix}.$$

They match. $\checkmark$ So $A = PDP^{-1}$ with the $P$ and $D$ above. The verification via $AP = PD$ never needed $P^{-1}$; that's the cheap, mistake-proof check.
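For the skeptical, here is the full product with $P^{-1}$ written out, as a cross-check on the shortcut. The inverse comes from the usual 2×2 swap-and-negate formula, with $\det P = (1)(-2) - (1)(1) = -3$:

$$P^{-1} = \frac{1}{-3}\begin{pmatrix} -2 & -1 \\ -1 & 1 \end{pmatrix} = \frac{1}{3}\begin{pmatrix} 2 & 1 \\ 1 & -1 \end{pmatrix}, \qquad PDP^{-1} = \begin{pmatrix} 5 & 2 \\ 5 & -4 \end{pmatrix}\cdot\frac{1}{3}\begin{pmatrix} 2 & 1 \\ 1 & -1 \end{pmatrix} = \frac{1}{3}\begin{pmatrix} 12 & 3 \\ 6 & 9 \end{pmatrix} = \begin{pmatrix} 4 & 1 \\ 2 & 3 \end{pmatrix} = A.$$

Same answer, more arithmetic, which is the argument for checking $AP = PD$ instead.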

Worked example 2 — powers without sweat

Let $A = \begin{pmatrix} 0 & 2 \\ 1 & 1 \end{pmatrix}$. Use diagonalization to compute $A^k \begin{pmatrix} 4 \\ 1 \end{pmatrix}$ for any $k$.

Eigenvalues. $\operatorname{trace} = 1$, $\det = -2$, so $\lambda^2 - \lambda - 2 = (\lambda - 2)(\lambda + 1) = 0$: $\lambda = 2$ and $\lambda = -1$.

Eigenvectors. $\lambda = 2$: $A - 2I = \begin{pmatrix} -2 & 2 \\ 1 & -1 \end{pmatrix}$, so $y = x$, $v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$. $\lambda = -1$: $A + I = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}$, so $x = -2y$, $v_2 = \begin{pmatrix} 2 \\ -1 \end{pmatrix}$.

Decompose the start vector in the eigenbasis. Solve $\begin{pmatrix} 4 \\ 1 \end{pmatrix} = a v_1 + b v_2$: $a + 2b = 4$ and $a - b = 1$. Subtract: $3b = 3 \Rightarrow b = 1$, then $a = 2$. So

$$\begin{pmatrix} 4 \\ 1 \end{pmatrix} = 2\begin{pmatrix} 1 \\ 1 \end{pmatrix} + 1\begin{pmatrix} 2 \\ -1 \end{pmatrix}.$$

Apply $A^k$. On an eigenvector, $A^k v_i = \lambda_i^k v_i$; that's the whole magic of diagonalization, applied directly:

$$A^k\begin{pmatrix} 4 \\ 1 \end{pmatrix} = 2 \cdot 2^k \begin{pmatrix} 1 \\ 1 \end{pmatrix} + 1\cdot(-1)^k \begin{pmatrix} 2 \\ -1 \end{pmatrix} = \begin{pmatrix} 2^{k+1} + 2(-1)^k \\ 2^{k+1} - (-1)^k \end{pmatrix}.$$

Sanity check at $k = 1$: $\begin{pmatrix} 4 + (-2) \\ 4 + 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 5 \end{pmatrix}$, and directly $A\begin{pmatrix} 4 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 5 \end{pmatrix}$. $\checkmark$ For large $k$ the $2^k$ term dominates: the system's long-run behavior is governed by the eigenvalue of largest absolute value. That single observation is the engine behind population growth, PageRank, and Markov chains.
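One sanity check is cheap, and eleven are cheaper by machine. This short Python sketch compares the closed form against plain repeated multiplication for k = 0 through 10. The helper names `apply` and `closed_form` are mine, and integer arithmetic keeps the comparison exact.

```python
def apply(A, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return (A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1])

def closed_form(k):
    """A^k (4, 1) from the eigen-decomposition: 2 * 2^k * v1 + (-1)^k * v2."""
    return (2 ** (k + 1) + 2 * (-1) ** k,
            2 ** (k + 1) - (-1) ** k)

A = [[0, 2], [1, 1]]
v = (4, 1)                    # this is A^0 applied to the start vector
mismatches = []
for k in range(11):
    if v != closed_form(k):
        mismatches.append(k)
    v = apply(A, v)           # step from A^k v to A^(k+1) v

print(closed_form(10), mismatches)  # (2050, 2047) []
```

At k = 10 the two components are already within a few units of each other relative to their size, which is the $2^k$ term taking over.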

Worked example 3 — the trap: a matrix that refuses to cooperate

Is the shear $S = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ diagonalizable? Decide rigorously.

Eigenvalues. $S$ is upper-triangular, so the eigenvalues are the diagonal entries: $\lambda = 1$, repeated (characteristic polynomial $(1 - \lambda)^2$).

Eigenvectors. Solve $(S - I)v = 0$:

$$S - I = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} y \\ 0 \end{pmatrix} = 0 \Rightarrow y = 0.$$

So the eigenvectors are exactly the nonzero multiples of $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$, a one-dimensional eigenspace. There is no second independent eigenvector.

Verdict. A $2\times2$ matrix needs two independent eigenvectors to be diagonalized; $S$ supplies only one. So $S$ is defective: not diagonalizable. The trap is assuming the repeated eigenvalue $\lambda = 1$ would hand over two directions; it doesn't. Geometrically the shear slides the plane sideways, and no choice of axes turns a slide into a pure stretch. Defective matrices are not a bug in your arithmetic. They genuinely exist, and recognizing one is the mark of someone who actually understands this.
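The same verdict can be reached by counting, as in this small Python sketch. It assumes exact entries (integers or fractions) so that comparisons with zero are reliable, and the function name `eigenspace_dim_2x2` is invented for this illustration. It returns the dimension of the solution space of $(A - \lambda I)v = 0$, then contrasts the shear with the identity matrix, which also has the repeated eigenvalue 1.

```python
def eigenspace_dim_2x2(A, lam):
    """Dimension of the null space of (A - lam*I) for a 2x2 matrix with exact entries."""
    m = [[A[0][0] - lam, A[0][1]],
         [A[1][0], A[1][1] - lam]]
    if all(entry == 0 for row in m for entry in row):
        return 2                      # zero matrix: every vector is a solution
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return 1 if det == 0 else 0       # rank 1 gives one direction; rank 2 means lam is not an eigenvalue

shear = [[1, 1], [0, 1]]
identity = [[1, 0], [0, 1]]
print(eigenspace_dim_2x2(shear, 1), eigenspace_dim_2x2(identity, 1))  # 1 2
```

Same repeated eigenvalue, different eigenspace dimensions: the identity fills a basis and the shear does not.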

☠ KNOWN HAZARDS

  • Mismatching the columns of $P$ with the diagonal of $D$. Column $i$ of $P$ must be an eigenvector for the eigenvalue in entry $(i,i)$ of $D$. Swap one and $A = PDP^{-1}$ silently produces the wrong matrix, and it looks perfectly fine until you check. Always verify $AP = PD$ column by column.

  • Assuming every matrix is diagonalizable. Defective matrices exist and they will bite you. A repeated eigenvalue is a red flag: you must check whether its eigenspace is big enough. The shear $\begin{pmatrix}1&1\\0&1\end{pmatrix}$ has only one eigendirection and is not diagonalizable over any field. No amount of clever algebra changes this fact.

  • Computing $A^k$ as $P^{-1}D^kP$. It's $PD^kP^{-1}$, eigenvectors out front. Mixing up which side $P$ goes on is the same $P$-vs-$P^{-1}$ confusion from change of basis; remember $A = PDP^{-1}$ and everything follows.

  • Forgetting that eigenvectors can be rescaled. Any nonzero multiple of an eigenvector is an eigenvector, so $P$ is not unique, but once you pick the columns, $D$'s order is locked to them. Different valid $P$'s give the same $A$; the freedom is real but the pairing is rigid.
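Two of these hazards are easy to see in numbers. The following Python sketch assumes the matrix with rows (4, 1) and (2, 3) and its eigenpairs 5 with (1, 1) and 2 with (1, −2). It uses `fractions.Fraction` for exact arithmetic, and the helpers `matmul` and `inverse` are my own names. It rebuilds the matrix three ways: with $PDP^{-1}$, with rescaled eigenvector columns, and with $P$ on the wrong side (the $k = 1$ case of the third hazard).

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(M):
    """Inverse of a 2x2 matrix via the adjugate, in exact arithmetic."""
    det = Fraction(M[0][0] * M[1][1] - M[0][1] * M[1][0])
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

A = [[4, 1], [2, 3]]
D = [[5, 0], [0, 2]]
P = [[1, 1], [1, -2]]
P_rescaled = [[3, -1], [3, 2]]     # v1 scaled by 3, v2 scaled by -1

right_order = matmul(matmul(P, D), inverse(P))                  # P D P^{-1}
rescaled = matmul(matmul(P_rescaled, D), inverse(P_rescaled))   # still A
wrong_order = matmul(matmul(inverse(P), D), P)                  # P^{-1} D P

print(right_order == A, rescaled == A, wrong_order == A)  # True True False
```

Rescaling the columns changes nothing, while putting $P$ on the wrong side yields a different matrix altogether.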

TL;DR

  • $A$ is diagonalizable iff it has a basis of eigenvectors; then $A = PDP^{-1}$ with eigenvectors as columns of $P$ and eigenvalues on $D$'s diagonal, in matching order.

  • Powers are trivial: $A^k = PD^kP^{-1}$, and $D^k$ just raises each diagonal entry to the $k$-th power. This is the Fibonacci / population-model payoff.

  • Distinct eigenvalues ⟹ guaranteed diagonalizable. Repeated eigenvalues ⟹ maybe (check for enough independent eigenvectors).

  • Defective matrices (like the shear $\begin{pmatrix}1&1\\0&1\end{pmatrix}$) lack a full eigenbasis and cannot be diagonalized.

  • Geometrically: in eigen-coordinates a diagonalizable map is pure axis-aligned stretching. Every such map is secretly that simple.
