Last time we learned that a matrix \(A\) is diagonalizable if \(A\) has \(n\) linearly independent eigenvectors; in other words, if we can form a basis \(\beta = \{v_1,...,v_n\}\) consisting of eigenvectors.

In that case, the matrix representation of \(L_A\) with respect to the basis \(\beta\) is a diagonal matrix whose diagonal entries are the eigenvalues corresponding to those eigenvectors:

$$ \begin{align*} [L_A]_{\beta}^{\beta} = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix}, Av_j = \lambda_j v_j \end{align*} $$

Here \(v_j\) is an eigenvector and \(\lambda_j\) is the corresponding eigenvalue. So to diagonalize a matrix, we need to find these eigenvectors. Last time we developed a plan for this:

  1. Find the eigenvalues \(\lambda_1, ..., \lambda_k\) (the roots of the characteristic polynomial)
  2. Find a basis \(\beta_i\) for each eigenspace \(E_{\lambda_i}\)
  3. If \(\sum_{i=1}^{k}\dim(E_{\lambda_i}) = n\), then collect all the separate bases and that's our basis \(\beta = \beta_1 \cup ... \cup \beta_k\) (see the sketch below). This step works because we proved in the last lecture the corollary that if \(\lambda_1 \neq \lambda_2\), then \(\beta_1 \cup \beta_2\) is linearly independent.
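
To make the plan concrete, here is a minimal sketch in Python using sympy; the matrix below is a hypothetical example, not one from the lecture.

```python
# A sketch of the three-step diagonalization plan using sympy.
from sympy import Matrix

A = Matrix([[4, 1], [2, 3]])  # hypothetical example matrix
n = A.rows

# Step 1: find the eigenvalues (roots of the characteristic polynomial).
eigenvalues = A.eigenvals()   # {eigenvalue: algebraic multiplicity}

# Step 2: find a basis for each eigenspace E_lambda = N(A - lambda*I).
bases = {lam: (A - lam * Matrix.eye(n)).nullspace() for lam in eigenvalues}

# Step 3: if the eigenspace dimensions sum to n, the union of the bases
# is a basis of eigenvectors and A is diagonalizable.
total = sum(len(b) for b in bases.values())
print("diagonalizable:", total == n)  # True here: eigenvalues are 2 and 5
```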


Examples of Non-Diagonalizable Matrices

Of course, the plan above might not work, and there are different ways a matrix can fail to be diagonalizable. Consider

$$ \begin{align*} A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \end{align*} $$

This matrix has no real eigenvalues (its characteristic polynomial \(t^2 + 1\) has no real roots), so it can’t be diagonalized over \(\mathbf{R}\). Next, consider

$$ \begin{align*} B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \end{align*} $$

This matrix has only one eigenvalue:

$$ \begin{align*} \det(B - tI_2) = \det \begin{pmatrix} 1-t & 1 \\ 0 & 1-t \end{pmatrix} = (1-t)^2 = 0 \rightarrow t = 1 \end{align*} $$

This implies

$$ \begin{align*} E_{\lambda_1} = N(B - 1I_2) = N \left( \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \right) = \{ (t, 0) \ | \ t \in \mathbf{R} \} \end{align*} $$

So the eigenspace supplies only one linearly independent eigenvector, and we don’t have enough to form a basis \(\beta\) of \(\mathbf{R}^2\).
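
Both failure modes are easy to confirm computationally; here is a small sympy check (an illustrative sketch, not part of the lecture).

```python
from sympy import Matrix

A = Matrix([[0, -1], [1, 0]])  # no real eigenvalues
B = Matrix([[1, 1], [0, 1]])   # one eigenvalue, eigenspace too small

# A's characteristic polynomial t^2 + 1 has only complex roots.
print(A.eigenvals())           # {-I: 1, I: 1}

# B's only eigenvalue is 1, with a one-dimensional eigenspace.
print(B.eigenvects())          # [(1, 2, [Matrix([[1], [0]])])]
print(B.is_diagonalizable())   # False
```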



Is There a Better Diagonalization Test?

Today, we will refine our answer to the question “Is \(A\) diagonalizable?”

Definition
A polynomial \(f(t)\) splits over \(\mathbf{R}\) if there are scalars \(c,a_1,...,a_k \in \mathbf{R}\) such that $$ \begin{align*} f(t) = c(t-a_1)(t - a_2)...(t-a_k) \end{align*} $$


In other words, can we completely factor the polynomial into linear factors?

Example 1: \(t^2 + 1\) doesn’t split over \(\mathbf{R}\). It does, however, split over \(\mathbf{C}\) as \((t - i)(t + i)\).

Example 2: \(t^2 - 2t + 1 = (t - 1)(t - 1)\) splits over \(\mathbf{R}\).
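
If you want to experiment, sympy's `factor` can check splitting directly; a small sketch:

```python
from sympy import symbols, factor, I

t = symbols('t')

print(factor(t**2 + 1))               # t**2 + 1: does not split over R
print(factor(t**2 + 1, extension=I))  # (t - I)*(t + I): splits over C
print(factor(t**2 - 2*t + 1))         # (t - 1)**2: splits over R
```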

So what does splitting have to do with diagonalizability? The following theorem explains this.

Theorem 1 (5.6)
If \(A\) is diagonalizable, then its characteristic polynomial splits over \(\mathbf{R}\).


Note that the converse is false: if the characteristic polynomial splits over \(\mathbf{R}\), it doesn’t necessarily mean that \(A\) is diagonalizable. (Example 2 above is the characteristic polynomial of the non-diagonalizable matrix \(B\).)



Eigenvalues of Similar Matrices

Before going into the proof of Theorem 1 above, the following proposition will be useful.

Proposition
If \(B = Q^{-1}AQ\), then $$ \begin{align*} \det(B - tI_n) = \det(A - tI_n) \end{align*} $$


This means that if \(A\) and \(B\) are similar, then they have the same characteristic polynomial, and hence the same eigenvalues.

Proof:

Observe that

$$ \begin{align*} 1 = \det(I_n) = \det(QQ^{-1}) = \det(Q)\det(Q^{-1}). \end{align*} $$

Therefore,

$$ \begin{align*} \det(B - tI_n) &= \det(Q^{-1}AQ - tI_n) \\ &= \det(Q^{-1}AQ - Q^{-1}(tI_n)Q) \\ &= \det(Q^{-1} (A - tI_n)Q) \text{ (factor out } Q^{-1} \text{ on the left and } Q \text{ on the right)}\\ &= \det(Q^{-1}) \det(A - tI_n) \det(Q) \\ &= \det(Q^{-1}) \det(Q) \det(A - tI_n) \text{ (they are just real numbers)}\\ &= \det(A - tI_n). \ \blacksquare \\ \end{align*} $$

Note here that in step 2, \(Q^{-1}(tI_n)Q = tQ^{-1}Q = tI_n\), so subtracting it changes nothing.
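
A quick numerical sanity check of the proposition with numpy (an illustrative sketch; `np.poly` returns the coefficients of a matrix's characteristic polynomial):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
Q = rng.standard_normal((3, 3))  # a random Q is almost surely invertible
B = np.linalg.inv(Q) @ A @ Q     # B is similar to A

# Similar matrices have the same characteristic polynomial.
print(np.poly(A))
print(np.poly(B))                # same coefficients up to rounding error
```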



Proof of Theorem 1 (5.6)

Suppose that \(A\) is diagonalizable. This means that there exists a basis \(\beta\) such that

$$ \begin{align*} [L_A]_{\beta}^{\beta} = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix}, Av_j = \lambda_j v_j \end{align*} $$

If we let \(\alpha\) be the standard basis, then we know that there exists a change of coordinates matrix \(Q = [I]^{\alpha}_{\beta}\) such that

$$ \begin{align*} [L_A]_{\beta}^{\beta} &= [I]_{\alpha}^{\beta}[L_A]_{\alpha}^{\alpha}[I]^{\alpha}_{\beta} \\ &= Q^{-1}AQ. \end{align*} $$

This means that \([L_A]_{\beta}^{\beta}\) and \(A\) are similar matrices. But by the previous proposition, this means that we have

$$ \begin{align*} \det ([L_A]_{\beta}^{\beta} -tI_n) &= \det(A - tI_n). \end{align*} $$

On the other hand,

$$ \begin{align*} \det ([L_A]_{\beta}^{\beta} -tI_n) &= \det \begin{pmatrix} \lambda_1 -t & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n -t \end{pmatrix} \\ &= (\lambda_1 - t)\cdots(\lambda_n - t), \end{align*} $$

which visibly splits over \(\mathbf{R}\). \(\blacksquare\)




Algebraic and Geometric Multiplicities of Eigenvalues

There is another condition for diagonalizability but we need a few more definitions.

Definition
The algebraic multiplicity of \(\lambda\) is the largest power \(m\) such that \((\lambda - t)^m\) divides \(\det(A - tI_n)\).


As an example, if \(A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\), then \(\lambda = 1\) has algebraic multiplicity 2 because \(\det(A - tI_2) = (1-t)^2\).

Definition
The geometric multiplicity of \(\lambda\) is \(\dim(E_{\lambda})\).


For the same example above, the geometric multiplicity of \(\lambda = 1\) is 1 because the nullspace \(E_1\) is spanned by a single vector.
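
Both multiplicities can be read off from sympy's `eigenvects`, which returns triples of eigenvalue, algebraic multiplicity, and a basis of the eigenspace; a small sketch:

```python
from sympy import Matrix

A = Matrix([[1, 1], [0, 1]])

for lam, alg_mult, basis in A.eigenvects():
    geo_mult = len(basis)           # dim E_lambda = size of the eigenspace basis
    print(lam, alg_mult, geo_mult)  # prints: 1 2 1
```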

Given these definitions we can now introduce the following theorem.

Theorem 2 (5.7)
The geometric multiplicity of \(\lambda\) is less than or equal to the algebraic multiplicity of \(\lambda\).


Proof
Let \(\lambda\) be an eigenvalue of \(A \in M_{n \times n}\) and let the geometric multiplicity of \(\lambda\) be \(k\), so by definition \(\dim(E_{\lambda}) = k\). The goal is to relate the characteristic polynomial of \(A\) to the dimension of \(E_{\lambda}\).

Let \(\{v_1,...,v_k\}\) be a basis for \(E_{\lambda}\). Since \(k \leq n\), we can extend this basis to a basis for \(\mathbf{R}^n\):

$$ \begin{align*} \beta = \{v_1, ..., v_k, v_{k+1},...,v_n\}. \end{align*} $$

Again, we are seeking a relationship between the dimension of the eigenspace and the characteristic polynomial of \(A\). We don’t yet know whether \(A\) is diagonalizable, but we do know by definition that

$$ \begin{align*} [L_A]_{\beta}^{\beta} &= \begin{pmatrix} [Av_1]_{\beta} & \cdots & [Av_n]_{\beta} \end{pmatrix} \\ &= \begin{pmatrix} [\lambda v_1]_{\beta} & \cdots & [\lambda v_k]_{\beta} & [Av_{k+1}]_{\beta} & \cdots & [Av_n]_{\beta} \end{pmatrix} \end{align*} $$

Since \(Av_j = \lambda v_j\) and hence \([\lambda v_j]_{\beta} = \lambda e_j\) for \(j = 1, ..., k\), the first \(k\) columns of this matrix are \(\lambda\) times the standard basis vectors:

$$ \begin{align*} [L_A]_{\beta}^{\beta} &= \begin{pmatrix} \lambda & 0 & \cdots & 0 & * & \cdots & * \\ 0 & \lambda & \cdots & 0 & * & \cdots & * \\ \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda & * & \cdots & * \\ 0 & 0 & \cdots & 0 & * & \cdots & * \\ \vdots & \vdots & & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 & * & \cdots & * \end{pmatrix} \end{align*} $$

If we organize this matrix into blocks, we will see that

$$ \begin{align*} [L_A]_{\beta}^{\beta} &= \begin{pmatrix} \lambda I_k & B \\ O & C \end{pmatrix} \end{align*} $$

where \(O\) is the \((n-k) \times k\) zero matrix, and \(B\) (of size \(k \times (n-k)\)) and \(C\) (of size \((n-k) \times (n-k)\)) are unknown. So we can write \(\det([L_A]_{\beta}^{\beta} - tI_n)\) as

$$ \begin{align*} \det([L_A]_{\beta}^{\beta} - tI_n) &= (\lambda - t)^k \det(C - tI_{n-k}) \end{align*} $$

But by the proposition we introduced earlier, since \(A\) and \([L_A]_{\beta}^{\beta}\) are similar, we have

$$ \begin{align*} \det(A - tI_n) &= \det([L_A]_{\beta}^{\beta} - tI_n) \\ &= (\lambda - t)^k \det(C - tI_{n-k}) \end{align*} $$

Why is this factorization true? The matrix \([L_A]_{\beta}^{\beta} - tI_n\) is block upper triangular with top-left block \((\lambda - t)I_k\), so its determinant is the product of the determinants of its diagonal blocks:

$$ \begin{align*} \det \begin{pmatrix} (\lambda - t)I_k & B \\ O & C - tI_{n-k} \end{pmatrix} = \det((\lambda - t)I_k)\det(C - tI_{n-k}) = (\lambda - t)^k \det(C - tI_{n-k}). \end{align*} $$

Hence, \((\lambda - t)^k\) divides \(\det(A - tI_n)\), so the algebraic multiplicity of \(\lambda\) is at least \(k\). \(\blacksquare\)



A More Refined Test for Diagonalizability

We’re finally ready to present the more refined test for diagonalizability.

Theorem (5.8(a))
\(A\) is diagonalizable if and only if
  • (a) \(\det(A - tI_n)\) splits over \(\mathbf{R}\)
  • (b) For each eigenvalue \(\lambda\), the geometric multiplicity equals the algebraic multiplicity


Proof

\(\Rightarrow:\) Suppose that \(A\) is diagonalizable. This means that we have a basis \(\beta = \{v_1,...,v_n\}\) consisting entirely of eigenvectors. We need to prove that (a) and (b) both hold. Theorem 1 above shows that (a) holds, so the characteristic polynomial splits over \(\mathbf{R}\) and

$$ \begin{align*} \det(A - tI_n) = (\lambda_1 - t)^{m_1}...(\lambda_k - t)^{m_k} \end{align*} $$

where \(\sum_{j=1}^{k} m_j = n\). Now, given \(\lambda_j\), let \(k_j\) be the geometric multiplicity of \(\lambda_j\). We know by definition that \(k_j = \dim(E_{\lambda_j})\). The goal is to prove that \(k_j = \dim(E_{\lambda_j}) = m_j\).

Theorem 2 above shows that the geometric multiplicity is always less than or equal to the algebraic multiplicity of any \(\lambda\), so \(k_j \leq m_j\). Moreover, each of the eigenvectors in \(\beta\) must belong to one of our eigenspaces, so let \(l_j = num(\{v_i \in \beta \ | \ v_i \in E_{\lambda_j}\})\). That is, \(l_j\) is the number of eigenvectors in \(\beta\) that belong to the eigenspace \(E_{\lambda_j}\). But we only have \(n\) eigenvectors in \(\beta\); therefore,

$$ \begin{align*} n = \sum_{j=1}^k l_j \end{align*} $$

We also know that

$$ \begin{align*} l_j \leq k_j \end{align*} $$

This is because the \(l_j\) vectors of \(\beta\) lying in \(E_{\lambda_j}\) form a linearly independent set inside \(E_{\lambda_j}\), which can have at most \(\dim(E_{\lambda_j}) = k_j\) elements. Putting the above observations together, we see that

$$ \begin{align*} n = \sum_{j=1}^k l_j \leq \sum_{j=1}^k k_j \leq \sum_{j=1}^k m_j = n \end{align*} $$

Since both ends of this chain equal \(n\), every inequality is an equality; together with \(k_j \leq m_j\) term by term, this forces \(k_j = m_j\) for \(j = 1,2,...,k\).

\(\Leftarrow:\) We’re given (a) and (b). One way to prove that \(A\) is diagonalizable is by constructing a basis of eigenvectors. So let \(\beta_j\) be a basis for \(E_{\lambda_j}\). Set \(\beta = \beta_1 \cup \beta_2 \cup ... \cup \beta_k\). By the homework, \(\beta\) is linearly independent and by construction, it consists of eigenvectors. It remains to show that the number of vectors in \(\beta\) is \(n\). But from (b), we know that

$$ \begin{align*} num(\beta) = \sum_{j=1}^k k_j = \sum_{j=1}^k m_j \end{align*} $$

The first equality holds because each \(\beta_j\) contributes \(k_j = \dim(E_{\lambda_j})\) vectors, and the second because (b) says the geometric multiplicity equals the algebraic multiplicity. And from (a), we know that

$$ \begin{align*} \sum_{j=1}^k m_j = n \end{align*} $$

So \(\beta\) is the desired basis and \(A\) must be diagonalizable. \(\ \blacksquare\)
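
The theorem translates directly into code. Below is a sketch in Python with sympy; the helper name is_diagonalizable_over_R is ours, not a library function.

```python
from sympy import Matrix

def is_diagonalizable_over_R(A: Matrix) -> bool:
    n = A.rows
    triples = A.eigenvects()  # (eigenvalue, algebraic mult., eigenspace basis)
    # (a) the characteristic polynomial splits over R: every eigenvalue is
    # real and the algebraic multiplicities sum to n.
    if any(not lam.is_real for lam, _, _ in triples):
        return False
    if sum(m for _, m, _ in triples) != n:
        return False
    # (b) geometric multiplicity equals algebraic multiplicity for each one.
    return all(len(basis) == m for _, m, basis in triples)

print(is_diagonalizable_over_R(Matrix([[1, 1], [0, 1]])))  # False
print(is_diagonalizable_over_R(Matrix([[2, 0], [0, 3]])))  # True
```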

One final note here: we’ve developed several tests, including the one above, for the diagonalizability of matrices. What about a general linear transformation \(T: V \rightarrow V\)? We can pick a basis \(\gamma\) of \(V\), form the matrix representation \([T]_{\gamma}^{\gamma}\), and check whether this matrix is diagonalizable; since different choices of \(\gamma\) give similar matrices, the answer doesn’t depend on the choice.



Computing \(A^k\)

If \(A\) is diagonalizable, then

$$ \begin{align*} A = QDQ^{-1} \text { where } D = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix} \end{align*} $$

This allows us to compute \(A^k\) easily: in the product \(A^k = (QDQ^{-1})(QDQ^{-1})\cdots(QDQ^{-1})\) every inner \(Q^{-1}Q\) cancels, so

$$ \begin{align*} A^k = QD^kQ^{-1} \text { where } D^k = \begin{pmatrix} \lambda_1^k & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n^k \end{pmatrix} \end{align*} $$
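
A numerical illustration with numpy (the matrix is a hypothetical example with real eigenvalues 2 and 5):

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])
k = 10

evals, Q = np.linalg.eig(A)     # columns of Q are eigenvectors of A
Dk = np.diag(evals ** k)        # raising D to the k-th power is entrywise
Ak = Q @ Dk @ np.linalg.inv(Q)  # A^k = Q D^k Q^{-1}

print(np.allclose(Ak, np.linalg.matrix_power(A, k)))  # True
```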




References

  • Math416 by Ely Kerman