Lecture 25: Diagonalization and Eigenvalues
Last time we learned that a matrix \(A \in M_{n \times n}\) is diagonalizable if \(A\) has \(n\) linearly independent eigenvectors, in other words, if we can form a basis \(\beta = \{v_1, \dots, v_n\}\) consisting of eigenvectors.
In that case, we will see that the matrix representation of \(L_A\) with respect to the basis \(\beta\) is a diagonal matrix whose diagonal entries are the eigenvalues corresponding to those eigenvectors:

\( [L_A]_{\beta}^{\beta} = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}, \)

where \(Av_j = \lambda_j v_j\), so each \(v_j\) is an eigenvector and \(\lambda_j\) is the corresponding eigenvalue. So to diagonalize a matrix, we need to find these eigenvectors. Last time we developed a plan for this:
- Find the eigenvalues \(\lambda_1, \dots, \lambda_k\)
- Find a basis \(\beta_i\) for each eigenspace \(E_{\lambda_i}\)
- If \(\sum_{i=1}^{k}\dim(E_{\lambda_i}) = n\), then collect all the separate bases; that's our basis \(\beta = \beta_1 \cup \dots \cup \beta_k\). This step works because we proved in the last lecture the corollary that if \(\lambda_1 \neq \lambda_2\), then \(\beta_1 \cup \beta_2\) is linearly independent.
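This plan can also be carried out by a computer algebra system. Below is a minimal sketch in Python using sympy; the library choice and the sample matrix are my own, not from the lecture.

```python
import sympy as sp

# A sample matrix, chosen here purely for illustration.
A = sp.Matrix([[4, 0, 1],
               [2, 3, 2],
               [1, 0, 4]])
n = A.rows

# Steps 1 and 2: eigenvalues and a basis for each eigenspace E_lambda.
beta = []  # will become beta = beta_1 ∪ ... ∪ beta_k
for eigenvalue, alg_mult, eigenspace_basis in A.eigenvects():
    print(f"lambda = {eigenvalue}, dim(E_lambda) = {len(eigenspace_basis)}")
    beta.extend(eigenspace_basis)

# Step 3: A is diagonalizable iff the eigenspace dimensions add up to n.
if len(beta) == n:
    Q = sp.Matrix.hstack(*beta)   # columns are the eigenvectors
    D = Q.inv() * A * Q           # diagonal, with the eigenvalues on the diagonal
    print("Diagonalizable:", D)
else:
    print("Not diagonalizable: only", len(beta), "independent eigenvectors")
```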
Examples of Non-Diagonalizable Matrices
Of course, the plan above might not work, and there are different ways a matrix can fail to be diagonalizable.
The first example is a matrix with no real eigenvalues, so it can't be diagonalized.
The second example is a matrix with only one eigenvalue, whose eigenspace has dimension strictly less than \(n\). So we don't have enough linearly independent eigenvectors to form a basis \(\beta\).
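The matrices from lecture are not reproduced above, but two standard \(2 \times 2\) instances of these failure modes (my own choices, for concreteness) are \(\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\), whose characteristic polynomial \(t^2 + 1\) has no real roots, so there are no real eigenvalues at all; and \(\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\), whose only eigenvalue is \(\lambda = 1\) with \(\dim(E_1) = 1 < 2\). The second matrix reappears below in the discussion of multiplicities.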
Is There a Better Diagonalization Test?
Today, we will refine our answer to the question “Is \(A\) diagonalizable?”
In other words, does the characteristic polynomial \(\det(A - tI_n)\) split over \(\mathbf{R}\), i.e., can we completely factor the polynomial into linear factors with real roots?
Example 1: \(t^2 + 1\) doesn't split over \(\mathbf{R}\). It does, however, split over \(\mathbf{C}\) as \((t - i)(t + i)\).
Example 2: \(t^2 - 2t + 1 = (t - 1)(t - 1)\) splits over \(\mathbf{R}\).
So now, what does splitting have to do with diagonalizability? The following theorem explains this.

Theorem 1 (5.6): If \(A\) is diagonalizable, then its characteristic polynomial \(\det(A - tI_n)\) splits over \(\mathbf{R}\).
Note that the converse is false: if the characteristic polynomial splits over \(\mathbf{R}\), it doesn't necessarily mean that \(A\) is diagonalizable. (See Example 2 above.)
Eigenvalues of Similar Matrices
Before going into the proof of Theorem 1 above, the following proposition will be useful.

Proposition: If \(A\) and \(B\) are similar matrices, say \(B = QAQ^{-1}\) for some invertible \(Q\), then they have the same characteristic polynomial: \(\det(A - tI_n) = \det(B - tI_n)\).

In particular, if \(A\) and \(B\) are similar, then they have the same eigenvalues.
Proof:
Observe that

\( Q(A - tI_n)Q^{-1} = QAQ^{-1} - QtI_nQ^{-1} = B - tI_n. \)

Therefore,

\( \det(B - tI_n) = \det\big(Q(A - tI_n)Q^{-1}\big) = \det(Q)\det(A - tI_n)\det(Q^{-1}) = \det(A - tI_n). \)

Note here that in step 2, \(QtI_nQ^{-1} = tQQ^{-1} = tI_n\). \(\ \blacksquare\)
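As a quick sanity check (not part of the lecture; the matrices below are arbitrary choices), the proposition is easy to verify symbolically:

```python
import sympy as sp

t = sp.symbols('t')

A = sp.Matrix([[2, 1], [0, 3]])
Q = sp.Matrix([[1, 2], [1, 3]])   # any invertible matrix
B = Q * A * Q.inv()               # B is similar to A

# Both characteristic polynomials det(M - t*I_n) agree.
pA = (A - t * sp.eye(2)).det().expand()
pB = (B - t * sp.eye(2)).det().expand()
print(pA, pB, pA == pB)           # t**2 - 5*t + 6 (twice), True
```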
Proof of Theorem 1 (5.6)
Suppose that \(A\) is diagonalizable. This means that there exists a basis \(\beta = \{v_1, \dots, v_n\}\) of eigenvectors such that

\( [L_A]_{\beta}^{\beta} = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}. \)

If we let \(\alpha\) be the standard basis, then we know that there exists a change of coordinate matrix \(Q = [I]^{\alpha}_{\beta}\) such that

\( [L_A]_{\beta}^{\beta} = Q^{-1} [L_A]_{\alpha}^{\alpha} Q = Q^{-1} A Q. \)

This means that \([L_A]_{\beta}^{\beta}\) and \(A\) are similar matrices. But by the previous proposition, this means that we have

\( \det(A - tI_n) = \det([L_A]_{\beta}^{\beta} - tI_n). \)

On the other hand,

\( \det([L_A]_{\beta}^{\beta} - tI_n) = (\lambda_1 - t)(\lambda_2 - t) \cdots (\lambda_n - t), \)

so \(\det(A - tI_n)\) splits over \(\mathbf{R}\). \(\ \blacksquare\)
Algebraic and Geometric Multiplicities of Eigenvalues
There is another condition for diagonalizability, but we need a few more definitions. The algebraic multiplicity of an eigenvalue \(\lambda\) is its multiplicity as a root of the characteristic polynomial \(\det(A - tI_n)\). The geometric multiplicity of \(\lambda\) is \(\dim(E_{\lambda})\), the dimension of its eigenspace.
As an example, if \(A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\), then \(\lambda = 1\) has algebraic multiplicity 2 because \(\det(A - tI_n) = (1-t)^2\).
For the same example, the geometric multiplicity of \(\lambda = 1\) is 1 because the nullspace \(E_1 = N(A - I_2)\) is spanned by a single vector, namely \(e_1\).
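A short sympy computation (illustration only, not part of the lecture) confirms both multiplicities for this example:

```python
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[1, 1], [0, 1]])

# Algebraic multiplicity: multiplicity of lambda = 1 as a root of det(A - t*I).
char_poly = (A - t * sp.eye(2)).det().factor()
print(char_poly)              # (t - 1)**2  -> algebraic multiplicity 2

# Geometric multiplicity: dimension of the eigenspace E_1 = N(A - I).
eigenspace_basis = (A - sp.eye(2)).nullspace()
print(len(eigenspace_basis))  # 1           -> geometric multiplicity 1
```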
Given these definitions, we can now introduce the following theorem: for every eigenvalue \(\lambda\) of \(A\), the geometric multiplicity of \(\lambda\) is less than or equal to its algebraic multiplicity.
Proof
Let \(\lambda\) be an eigenvalue of \(A \in M_{n \times n}\) and let the geometric multiplicity of \(\lambda\) be \(k\), so that by the definition above, \(\dim(E_{\lambda}) = k\). The goal is to relate the characteristic polynomial of \(A\) to the dimension of \(E_{\lambda}\).
Let \(\{v_1, \dots, v_k\}\) be a basis for \(E_{\lambda}\). Since \(k \leq n\), we can extend this to a basis for \(\mathbf{R}^n\), say \(\beta = \{v_1, \dots, v_k, v_{k+1}, \dots, v_n\}\).
Again, we are seeking a relationship between the dimension of the eigenspace and the characteristic polynomial of \(A\). We do not yet know whether \([L_A]_{\beta}^{\beta}\) is diagonal. But we do know by definition that the \(j\)-th column of \([L_A]_{\beta}^{\beta}\) is \([L_A(v_j)]_{\beta} = [Av_j]_{\beta}\), the coordinate vector of \(Av_j\) with respect to \(\beta\).

If we only look at the first \(k\) columns of this matrix: for \(j \leq k\) we have \(Av_j = \lambda v_j\), so the \(j\)-th column is \([\lambda v_j]_{\beta} = \lambda e_j\), with \(\lambda\) in the \(j\)-th entry and zeros elsewhere.

If we organize this matrix into blocks, we will see that

\( [L_A]_{\beta}^{\beta} = \begin{pmatrix} \lambda I_k & B \\ O & C \end{pmatrix}, \)

where \(O\) is the zero matrix and \(B\) and \(C\) are unknown. So we can write \(\det([L_A]_{\beta}^{\beta} - tI_n)\) as

\( \det([L_A]_{\beta}^{\beta} - tI_n) = \det\begin{pmatrix} (\lambda - t) I_k & B \\ O & C - tI_{n-k} \end{pmatrix} = (\lambda - t)^k \det(C - tI_{n-k}). \)
But by the proposition we introduced earlier, since \(A\) and \([L_A]_{\beta}^{\beta}\) are similar, we then have

\( \det(A - tI_n) = \det([L_A]_{\beta}^{\beta} - tI_n) = (\lambda - t)^k \det(C - tI_{n-k}). \)

Why does the factor \((\lambda - t)^k\) appear? The top-left block of \([L_A]_{\beta}^{\beta} - tI_n\) just consists of \(\lambda - t\) entries on the diagonal while the rest of that block is zero. Hence \((\lambda - t)^k\) divides \(\det(A - tI_n)\), so the algebraic multiplicity of \(\lambda\) is at least \(k\). \(\ \blacksquare\)
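As a concrete instance (my own illustration, reusing the example above), take \(A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\) with \(\lambda = 1\) and \(k = 1\): extending the eigenspace basis \(\{e_1\}\) to \(\beta = \{e_1, e_2\}\) gives \([L_A]_{\beta}^{\beta} = A\), which already has the block form with \(\lambda I_k = (1)\), \(B = (1)\), \(O = (0)\), and \(C = (1)\). Indeed, \(\det(A - tI_2) = (1 - t)^1 \det(C - tI_1) = (1-t)^2\), so the algebraic multiplicity is at least \(k = 1\).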
A More Refined Test for Diagonalizability
We're finally ready to present the more refined test for diagonalizability. Theorem: \(A\) is diagonalizable if and only if both of the following hold:
- (a) \(\det(A - tI_n)\) splits over \(\mathbf{R}\)
- (b) for each eigenvalue \(\lambda\), the geometric multiplicity equals the algebraic multiplicity
Proof
\(\Rightarrow:\)
Suppose that \(A\) is diagonalizable. This means that we have a basis \(\beta = \{v_1, \dots, v_n\}\) consisting entirely of eigenvectors. Now we need to prove that (a) and (b) both hold. We proved above (Theorem 1) that (a) holds. This means that the characteristic polynomial splits over \(\mathbf{R}\), and so

\( \det(A - tI_n) = (\lambda_1 - t)^{m_1} (\lambda_2 - t)^{m_2} \cdots (\lambda_k - t)^{m_k}, \)
where \(\sum_{j=1}^{k} m_j = n\). Now, given \(\lambda_j\), let \(k_j\) be the geometric multiplicity of \(\lambda_j\). We know by definition that \(k_j = \dim(E_{\lambda_j})\). The goal is to prove that \(k_j = \dim(E_{\lambda_j}) = m_j\).
We proved above that the geometric multiplicity is always less than or equal to the algebraic multiplicity of any \(\lambda\). Therefore, we have \(k_j \leq m_j\). Moreover, each of the eigenvectors in \(\beta\) must belong to exactly one of our eigenspaces, so let \(l_j = num(\{v_i \in \beta \ | \ v_i \in E_{\lambda_j}\})\); that is, \(l_j\) is the number of eigenvectors in \(\beta\) that belong to the eigenspace \(E_{\lambda_j}\). But we only have \(n\) eigenvectors in \(\beta\); therefore,

\( \sum_{j=1}^{k} l_j = n. \)

We also know that

\( l_j \leq \dim(E_{\lambda_j}) = k_j. \)

This is because the number of vectors in \(\beta\) that lie in the eigenspace can be at most the dimension of the eigenspace itself, since they are linearly independent. Putting the above observations together, we see that

\( n = \sum_{j=1}^{k} l_j \leq \sum_{j=1}^{k} k_j \leq \sum_{j=1}^{k} m_j = n. \)
Therefore, \(k_j = m_j\) for \(j = 1,2,...,k\).
\(\Leftarrow:\) We're given (a) and (b). One way to prove that \(A\) is diagonalizable is by constructing a basis of eigenvectors. So let \(\beta_j\) be a basis for \(E_{\lambda_j}\), and set \(\beta = \beta_1 \cup \beta_2 \cup \dots \cup \beta_k\). By the homework, \(\beta\) is linearly independent, and by construction it consists of eigenvectors. It remains to show that the number of vectors in \(\beta\) is \(n\). But from (b), we know that the number of vectors in \(\beta\) is

\( \sum_{j=1}^{k} \dim(E_{\lambda_j}) = \sum_{j=1}^{k} k_j = \sum_{j=1}^{k} m_j, \)

because in (b) we said that the geometric multiplicity must be equal to the algebraic multiplicity. And from (a), we know that

\( \sum_{j=1}^{k} m_j = n, \)

since the characteristic polynomial splits over \(\mathbf{R}\) into linear factors and has degree \(n\).
So \(\beta\) is the desired basis and \(A\) must be diagonalizable. \(\ \blacksquare\)
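A minimal sketch of this refined test in code, again using sympy (the function name and sample matrices are my own):

```python
import sympy as sp

def is_diagonalizable_over_R(A: sp.Matrix) -> bool:
    """Refined test: the characteristic polynomial splits over R and, for each
    eigenvalue, the geometric multiplicity equals the algebraic multiplicity."""
    eigen_data = A.eigenvects()   # list of (eigenvalue, alg. mult., eigenspace basis)

    # (a) det(A - t*I_n) splits over R exactly when every eigenvalue is real.
    if any(not ev.is_real for ev, _, _ in eigen_data):
        return False

    # (b) geometric multiplicity == algebraic multiplicity for each eigenvalue.
    return all(len(vecs) == mult for _, mult, vecs in eigen_data)

print(is_diagonalizable_over_R(sp.Matrix([[1, 1], [0, 1]])))   # False: (b) fails
print(is_diagonalizable_over_R(sp.Matrix([[0, -1], [1, 0]])))  # False: (a) fails
print(is_diagonalizable_over_R(sp.Matrix([[2, 0], [0, 3]])))   # True
```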
One final note here: we've developed several tests, including the one above, for matrices. What about a general linear transformation \(T: V \rightarrow V\)? Well, we can pick a basis \(\gamma\) for \(V\), form the matrix representation \([T]_{\gamma}^{\gamma}\), and check whether this matrix is diagonalizable; the answer does not depend on the choice of \(\gamma\), since different choices of basis give similar matrices.
Computing \(A^k\)
If \(A\) is diagonalizable, then

\( A = Q D Q^{-1}, \)

where the columns of \(Q\) are the eigenvectors in \(\beta\) and \(D\) is the diagonal matrix of the corresponding eigenvalues. This allows us to compute \(A^k\) easily, since

\( A^k = (QDQ^{-1})(QDQ^{-1}) \cdots (QDQ^{-1}) = Q D^k Q^{-1}, \)

and \(D^k\) is obtained by simply raising each diagonal entry of \(D\) to the \(k\)-th power.
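For instance, a quick check in sympy (the sample matrix is my own choice):

```python
import sympy as sp

A = sp.Matrix([[4, 1], [2, 3]])   # eigenvalues 2 and 5, so A is diagonalizable
Q, D = A.diagonalize()            # A == Q * D * Q.inv()

k = 10
# D**k just raises each diagonal entry to the k-th power,
# so computing A**k via Q * D**k * Q.inv() is cheap.
A_k = Q * D**k * Q.inv()
print(A_k == A**k)                # True
```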
References
- Math416 by Ely Kerman