Lecture 37/38: The Jordan Canonical Form and Generalized Eigenvectors
Last time we proved that if \(T: V \rightarrow V\) is self-adjoint, then there is an orthonormal basis \(\beta\) of \(V\) consisting of eigenvectors of \(T\). This led to the conclusion that \(T\) is diagonalizable.
A Test For Diagonalizability
We previously studied a few ways to test a linear operator for diagonalizability. From Lecture 25, \(T\) is diagonalizable if and only if both of the following hold:
- (a) The characteristic polynomial of \(T\) splits over its field (\(\mathbf{R}\) or \(\mathbf{C}\))
- (b) For each eigenvalue \(\lambda\) of \(T\), its geometric multiplicity \(\dim(E_{\lambda})\) equals its algebraic multiplicity
We also mentioned last time that if \(V\) is over \(\mathbf{C}\), then the characteristic polynomial always splits. Note also that you can have \((a)\) but not \((b)\). For example, the shear matrix
\[
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}
\]
has characteristic polynomial \((1-t)^2\), which splits, but \(\dim(E_1) = 1 < 2\), so it doesn't satisfy \((b)\).
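As a quick sanity check, this behavior can be verified with sympy, using the classic \(2 \times 2\) shear matrix as an assumed example (the lecture's own example matrix is not reproduced here):

```python
import sympy as sp

# An assumed illustration: the characteristic polynomial of this shear
# matrix splits over R, but the eigenspace E_1 is one-dimensional,
# so condition (b) fails.
A = sp.Matrix([[1, 1],
               [0, 1]])

t = sp.symbols('t')
p = sp.factor(A.charpoly(t).as_expr())
print(p)                          # (t - 1)**2, so the polynomial splits

geom_mult = len(A.eigenvects()[0][2])
print(geom_mult)                  # 1, but the algebraic multiplicity is 2
print(A.is_diagonalizable())      # False
```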
Jordan Canonical Form
Based on the previous observation, it turns out there is a nice form we can put \(T\) into whenever \((a)\) holds, even if \((b)\) fails.
But what is a Jordan Canonical Form? We first define a Jordan block as follows: for \(m \geq 1\) and a scalar \(\lambda\), the Jordan block of size \(m\) is the \(m \times m\) matrix with \(\lambda\) in every diagonal entry, \(1\) in every entry just above the diagonal, and \(0\) elsewhere.
So they’re almost diagonal but not quite. For example, the \(2 \times 2\) and \(3 \times 3\) Jordan blocks look like
\[
\begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}, \qquad
\begin{pmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{pmatrix}.
\]
Using these Jordan blocks, we can now define what the Jordan Canonical form is
You can think of this matrix as a generalization of a diagonal matrix.
Examples
The following are examples of matrices in Jordan Canonical Form.
Note here that the characteristic polynomial of both \(A\) and \(B\) is \((1-t)^3(2-t)^2\).
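Since the displayed matrices \(A\) and \(B\) are not reproduced here, here is a hypothetical matrix in JCF with the same characteristic polynomial, built from a \(3 \times 3\) block for eigenvalue \(1\) and a \(2 \times 2\) block for eigenvalue \(2\), checked with sympy:

```python
import sympy as sp

# Hypothetical matrix in JCF: a 3x3 Jordan block for eigenvalue 1,
# then a 2x2 Jordan block for eigenvalue 2.
A = sp.Matrix([
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 2, 1],
    [0, 0, 0, 0, 2]])

t = sp.symbols('t')
p = sp.factor(A.charpoly(t).as_expr())
print(p)  # (t - 1)**3 * (t - 2)**2, i.e. (1-t)^3 (2-t)^2 up to sign
```

(sympy's `charpoly` uses the \(\det(tI - A)\) convention, which differs from \((1-t)^3(2-t)^2\) by an overall sign.)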
Computing Powers of Matrices in JCF
It turns out that we can write a formula for powers of matrices in JCF. It is not as easy as taking the power of a diagonal matrix, but at least we have a formula.
Fact 1: The \(k\)-th power of a single Jordan block is upper triangular, with binomial coefficients appearing on the superdiagonals. For the \(3 \times 3\) block,
\[
\begin{pmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{pmatrix}^{k}
=
\begin{pmatrix}
\lambda^{k} & k\lambda^{k-1} & \binom{k}{2}\lambda^{k-2} \\
0 & \lambda^{k} & k\lambda^{k-1} \\
0 & 0 & \lambda^{k}
\end{pmatrix}.
\]
Note here that the third entry in the first row, for example, is \(\binom{k}{2}\lambda^{k-2} = \frac{k(k-1)}{2!}\lambda^{k-2}\).
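A quick symbolic check of this formula for a \(3 \times 3\) Jordan block, sketched with sympy for one fixed power:

```python
import sympy as sp

lam = sp.symbols('lambda')
# 3x3 Jordan block with eigenvalue lambda
J = sp.Matrix([[lam, 1, 0],
               [0, lam, 1],
               [0, 0, lam]])

k = 5  # any fixed power works for this check
Jk = J**k

# Fact 1 predicts binomial-coefficient entries on the superdiagonals
expected = sp.Matrix([
    [lam**k, k*lam**(k-1), sp.binomial(k, 2)*lam**(k-2)],
    [0,      lam**k,       k*lam**(k-1)],
    [0,      0,            lam**k]])

assert sp.expand(Jk - expected) == sp.zeros(3, 3)
print("Fact 1 holds for k =", k)
```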
Fact 2: If \(A\) is in Jordan Canonical Form with Jordan blocks \(A_1, ..., A_m\) down the diagonal, then \(A^k\) is again block diagonal, with blocks \(A_1^k, ..., A_m^k\).
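Fact 2 can be checked numerically; a minimal numpy sketch, assuming the block sizes \(J_3(1)\) and \(J_2(2)\) from the earlier examples:

```python
import numpy as np

def jordan_block(lam, m):
    """m x m Jordan block: lam on the diagonal, 1 on the superdiagonal."""
    return lam * np.eye(m) + np.diag(np.ones(m - 1), k=1)

# A hypothetical JCF matrix with blocks J_3(1) and J_2(2)
J1, J2 = jordan_block(1.0, 3), jordan_block(2.0, 2)
A = np.block([[J1, np.zeros((3, 2))],
              [np.zeros((2, 3)), J2]])

k = 7
Ak = np.linalg.matrix_power(A, k)

# Fact 2: powering a block-diagonal matrix powers each block separately
blockwise = np.block([[np.linalg.matrix_power(J1, k), np.zeros((3, 2))],
                      [np.zeros((2, 3)), np.linalg.matrix_power(J2, k)]])
assert np.allclose(Ak, blockwise)
```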
Proof of JCF Theorem
So now we see that matrices in JCF are useful, and we have enough motivation to prove the theorem above, which again states that if the characteristic polynomial splits, then there is a basis \(\beta\) such that \([T]^{\beta}_{\beta}\) is in JCF! To prove it, let \(\beta = \{v_1, ..., v_n\}\) be a basis of \(V\). We want this basis to be such that
\[
[T]^{\beta}_{\beta} = \begin{pmatrix} A_1 & & \\ & \ddots & \\ & & A_k \end{pmatrix},
\]
where each \(A_i\) is a Jordan block.
These vectors \(v_1,...,v_j\) are not necessarily eigenvectors. What are they? Let’s focus on \(A_1\) above and let \(A_1\) be of size \(n_1 \times n_1\).
Since \(A_1\) has the form above (a Jordan block with eigenvalue \(\lambda_1\)), we know at least that \(T(v_1) = \lambda_1 v_1\). What about \(T(v_2)\)? We see that column 2 has \(1\) in the first row and then \(\lambda_1\) in the second row. So \(T(v_2) = v_1 + \lambda_1 v_2\). Therefore, \(v_2\) is not an eigenvector, but we can observe that
\[
(T - \lambda_1 I_V)(v_2) = v_1.
\]
So at this point, we see that \(v_2\) is not an eigenvector, but if we apply the map \((T - \lambda_1 I_V)\) to it, it becomes an eigenvector. What if we apply this map twice?
\[
(T - \lambda_1 I_V)^2(v_2) = (T - \lambda_1 I_V)(v_1) = \bar{0}_V.
\]
What about the remaining vectors? Reading off the columns of the Jordan block, \(T(v_i) = v_{i-1} + \lambda_1 v_i\) for \(2 \leq i \leq n_1\), so
\[
(T - \lambda_1 I_V)(v_i) = v_{i-1}
\quad \text{and hence} \quad
(T - \lambda_1 I_V)^{i}(v_i) = \bar{0}_V.
\]
So they’re not eigenvectors, but they satisfy these equations. Based on this observation, we define the following: a nonzero vector \(x \in V\) is a generalized eigenvector of \(T\) corresponding to \(\lambda\) if \((T - \lambda I_V)^{p}(x) = \bar{0}_V\) for some positive integer \(p\).
Observation: When \(p\) is the smallest positive integer for which \((T - \lambda I_V)^p(x) = \bar{0}_V\), then \(y = (T - \lambda I_V)^{p-1}(x)\) is an eigenvector: by minimality of \(p\) we have \(y \neq \bar{0}_V\), and \((T - \lambda I_V)(y) = (T - \lambda I_V)^p(x) = \bar{0}_V\), so \(T(y) = \lambda y\).
So now we know that the basis we want to build will consist of generalized eigenvectors. These generalized eigenvectors belong to the subspaces we define next:
\[
K_{\lambda} = \{x \in V : (T - \lambda I_V)^{p}(x) = \bar{0}_V \text{ for some positive integer } p\}.
\]
- (a) \(K_{\lambda}\) is a \(T\)-invariant subspace of \(V\) containing \(E_{\lambda}\).
- (b) For \(\mu \neq \lambda\), the restriction of \(T - \mu I_V\) to \(K_{\lambda}\) is one-to-one.
Proof
(a) We need to show three things: that \(K_{\lambda}\) contains \(E_{\lambda}\), that it is a subspace, and that it is \(T\)-invariant. First, if \(x \in E_{\lambda}\), then \((T - \lambda I_V)(x) = \bar{0}_V\), so taking \(p = 1\) in the definition shows \(x \in K_{\lambda}\); hence \(E_{\lambda} \subseteq K_{\lambda}\).
Next we need to show that \(K_{\lambda}\) is a subspace. This means we need to show that it contains the zero vector and is closed under addition and scalar multiplication. \((T - \lambda I_V)(\bar{0}_V) = \bar{0}_V\), so \(\bar{0}_V \in K_{\lambda}\). Now consider \(x, y \in K_{\lambda}\) and \(c \in \mathbf{F}\); we need to show that \(x + cy \in K_{\lambda}\). Since \(x\) and \(y\) are in \(K_{\lambda}\), then
\[
(T - \lambda I_V)^{p}(x) = \bar{0}_V \quad \text{and} \quad (T - \lambda I_V)^{q}(y) = \bar{0}_V
\]
for some positive integers \(p\) and \(q\). Therefore, taking \(r = \max(p, q)\),
\[
(T - \lambda I_V)^{r}(x + cy) = (T - \lambda I_V)^{r}(x) + c\,(T - \lambda I_V)^{r}(y) = \bar{0}_V,
\]
so \(x + cy \in K_{\lambda}\).
Next, we need to show that \(K_{\lambda}\) is \(T\)-invariant, i.e. that \(T(K_{\lambda}) \subseteq K_{\lambda}\). Let \(x \in K_{\lambda}\); we want to show that \(T(x) \in K_{\lambda}\). Since \(x \in K_{\lambda}\), then
\[
(T - \lambda I_V)^{p}(x) = \bar{0}_V
\]
for some positive integer \(p\). Apply the linear map \(T\) to both sides:
\[
T\big((T - \lambda I_V)^{p}(x)\big) = T(\bar{0}_V) = \bar{0}_V.
\]
Now, \(T\) and \((T - \lambda I_V)^{p}\) commute. Why? Expanding \((T - \lambda I_V)^{p}\) gives a linear combination of powers of \(T\), and \(T\) commutes with each of its own powers. So
\[
(T - \lambda I_V)^{p}(T(x)) = T\big((T - \lambda I_V)^{p}(x)\big) = \bar{0}_V.
\]
So we see above that \(T(x)\) belongs to \(K_{\lambda}\) as we wanted to show.
(b) Next, we want to prove that the restriction of \(T - \mu I_V\) to \(K_{\lambda}\) is one-to-one. So we want to think of \(T - \mu I_V\) as a map on \(K_{\lambda}\), but what is the right target? We know it maps into \(V\), but should we consider a smaller target like \(K_{\lambda}\)? Let's look at the image of this map when it acts on a vector \(x \in K_{\lambda}\):
\[
(T - \mu I_V)(x) = T(x) - \mu x.
\]
We know that \(K_{\lambda}\) is \(T\)-invariant, so \(T(x) \in K_{\lambda}\). What about \(\mu x\)? This is just a scalar multiple of \(x\), and since \(K_{\lambda}\) is a subspace, we know that \(\mu x \in K_{\lambda}\). Therefore the difference \(T(x) - \mu x\) is also in \(K_{\lambda}\), again because \(K_{\lambda}\) is a subspace. So this tells us that the target we want to consider is \(K_{\lambda}\).
So now we want to prove that the map \(T - \mu I_V: K_{\lambda} \rightarrow K_{\lambda}\) is one-to-one. One way to show this is to prove that the nullspace of this map is trivial. This means that the only solution to
\[
(T - \mu I_V)(x) = \bar{0}_V, \quad x \in K_{\lambda},
\]
is the trivial solution \(x = \bar{0}_V\). Suppose for the sake of contradiction that this isn't true, so \(x \neq \bar{0}_V\) but \((T - \mu I_V)(x) = \bar{0}_V\).
However, we know that \(x \in K_{\lambda}\), so it must be killed by some power of \((T - \lambda I_V)\). Let \(p\) be the smallest positive integer such that
\[
(T - \lambda I_V)^{p}(x) = \bar{0}_V.
\]
But if we take the power just below \(p\) and set
\[
y = (T - \lambda I_V)^{p-1}(x),
\]
then \(y \neq \bar{0}_V\) by minimality of \(p\), while \((T - \lambda I_V)(y) = \bar{0}_V\). This implies that \(y\) is an eigenvector and \(y \in E_{\lambda}\). Observe next what happens when we apply the map \((T - \mu I_V)\) to the eigenvector \(y\). Since \((T - \mu I_V)\) commutes with \((T - \lambda I_V)^{p-1}\),
\[
(T - \mu I_V)(y) = (T - \lambda I_V)^{p-1}\big((T - \mu I_V)(x)\big) = (T - \lambda I_V)^{p-1}(\bar{0}_V) = \bar{0}_V.
\]
So we’ve shown that \(y\) is an eigenvector for the eigenvalue \(\mu\): \(y \neq \bar{0}_V\) and \(y \in E_{\mu}\). But we also saw that \(y \in E_{\lambda}\), even though \(\lambda \neq \mu\). So \(y \in E_{\mu} \cap E_{\lambda}\). Since eigenspaces corresponding to distinct eigenvalues intersect only in \(\bar{0}_V\), this is a contradiction.
Finding the Generalized Eigenvectors
So to remind ourselves, the goal of this whole process is to find a basis consisting of generalized eigenvectors. The next theorem makes it practically easier to find them.
This makes finding a basis for \(K_{\lambda}\) simple because it’s just a matter of finding the nullspace like we did before by putting the matrix in echelon form. What’s next? We want these generalized eigenvectors to span \(V\) since we want a basis. The following theorem confirms it.
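In code, this is just a nullspace computation. A minimal sketch with sympy, using an assumed \(3 \times 3\) example and the safe exponent \(n = \dim V\):

```python
import sympy as sp

# An assumed example: eigenvalue 2 with algebraic multiplicity 2
# (but a 1-dimensional eigenspace) and eigenvalue 3 with multiplicity 1.
A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])
n = A.rows

# A basis of K_lambda is the nullspace of (A - lambda*I)^n,
# found by row reduction just as for ordinary eigenspaces.
for lam, alg_mult in A.eigenvals().items():
    K_basis = ((A - lam * sp.eye(n))**n).nullspace()
    print(lam, len(K_basis), alg_mult)
    # dim K_lambda equals the algebraic multiplicity
    assert len(K_basis) == alg_mult
```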
The next thing we need is to know that these generalized eigenvectors are linearly independent. Once we have that, we can construct the basis that we want.
- (a) \(\beta_i \cap \beta_j = \emptyset\) for \(i \neq j\)
- (b) \(\beta = \beta_1 \cup ... \cup \beta_k\) is a basis for \(V\)
- (c) \(\dim(K_{\lambda_j}) = \) the algebraic multiplicity of \(\lambda_j\)
Proof
(a): Assume for the sake of contradiction that \(\beta_i \cap \beta_j \neq \emptyset\). Then there exists \(x \in \beta_i \cap \beta_j\). We know that \(\beta_i \cap \beta_j \subseteq K_{\lambda_i} \cap K_{\lambda_j}\). Since \(i \neq j\), then \(\lambda_i \neq \lambda_j\). Therefore by Theorem 1.1(b), the restriction of \((T - \lambda_i I_V)\) to \(K_{\lambda_j}\) is 1-1.
Since it’s one-to-one, its nullspace is just the zero vector, and hence every power \((T - \lambda_i I_V)^{m}\) restricted to \(K_{\lambda_j}\) is also one-to-one. Now \(x\) is a basis vector, so \(x \neq \bar{0}_V\), and since \(x \in K_{\lambda_j}\), applying \((T - \lambda_i I_V)\) to \(x\) any number of times can never produce \(\bar{0}_V\). But \(x\) is also in \(K_{\lambda_i}\), so \((T - \lambda_i I_V)^{p}(x) = \bar{0}_V\) for some positive integer \(p\). This is a contradiction. So the intersection is empty, as we wanted to show.
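Putting it all together: sympy's `jordan_form` carries out exactly this program, returning a basis of generalized eigenvectors as the columns of \(P\). A small sketch, with an assumed example matrix built by conjugating a known Jordan matrix:

```python
import sympy as sp

# Build an assumed example by hiding a known Jordan matrix
# (blocks J_2(2) and J_1(3)) behind a change of basis P0.
J0 = sp.Matrix([[2, 1, 0],
                [0, 2, 0],
                [0, 0, 3]])
P0 = sp.Matrix([[1, 1, 0],
                [0, 1, 1],
                [1, 0, 1]])
A = P0 * J0 * P0.inv()

# jordan_form recovers the JCF and the generalized-eigenvector basis
P, J = A.jordan_form()
assert A == P * J * P.inv()                          # A = P J P^{-1}
assert sorted(J[i, i] for i in range(3)) == [2, 2, 3]  # eigenvalues
```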
References
- Math416 by Ely Kerman