Consider a linear map \(T: V \rightarrow V\). Suppose \(v\) is an eigenvector of \(T\), so by definition \(Tv = \lambda v\) and \(v \neq 0\). We also know that \(span\{v\} \subseteq V\) is a subspace. The key observation is that since \(v\) is an eigenvector, when \(T\) acts on this subspace, the result stays inside the subspace, meaning

$$ \begin{align*} T(span\{v\}) \subset span\{v\} \end{align*} $$

This is because \(T(cv) = cT(v) = (c\lambda) v\), which is in the span of \(v\). The span of an eigenvector is the simplest example of an invariant subspace.
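We can check this numerically. A minimal sketch using numpy, where the matrix \(T\) and eigenvector \(v\) below are assumptions chosen purely for illustration (a diagonal matrix with \(v = (1,0)\), eigenvalue 2):

```python
import numpy as np

# Illustrative example: T = diag(2, 3) has eigenvector v = (1, 0)
# with eigenvalue 2.
T = np.array([[2.0, 0.0], [0.0, 3.0]])
v = np.array([1.0, 0.0])

# Any w in span{v} is w = c*v; applying T should keep it in span{v}.
v_unit = v / np.linalg.norm(v)
for c in [1.0, -2.5, 7.0]:
    w = c * v
    Tw = T @ w
    # The component of Tw orthogonal to v must vanish if Tw ∈ span{v}.
    residual = Tw - (Tw @ v_unit) * v_unit
    assert np.allclose(residual, 0.0)
print("T(span{v}) is contained in span{v}")
```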

Definition
A subspace \(W \subseteq V\) is \(T\)-invariant if \(T(W) \subseteq W\).


i.e. \(T(w) \in W \ \forall w \in W\)

Examples

Example 1: \(W = span\{v\}\) where \(v\) is an eigenvector of \(T\).

Example 2: the identity map \(I_V:V \rightarrow V\). Every subspace is \(I_V\)-invariant.

Example 3: the zero map \(0_V:V \rightarrow V\). Every subspace is \(0_V\)-invariant.

Example 4:

$$ \begin{align*} T : \ &\mathbf{R}^2 \rightarrow \mathbf{R}^2 \\ &(x, y) \mapsto (x,0) \end{align*} $$

\(W = span\{(1,0)\}\) is \(T\)-invariant since \(T(1,0) = (1,0) \in W\).
\(W = span\{(1,1)\}\) is not \(T\)-invariant since \(T(1,1) = (1,0) \notin W\).
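Both claims from Example 4 can be verified directly. A small numpy sketch (the `in_span` helper is a hypothetical function written just for this check):

```python
import numpy as np

# T(x, y) = (x, 0) written as a matrix.
T = np.array([[1.0, 0.0], [0.0, 0.0]])

def in_span(u, w):
    """Check whether u lies in span{w}, for nonzero w."""
    w_unit = w / np.linalg.norm(w)
    # u is in span{w} iff its component orthogonal to w vanishes.
    return bool(np.allclose(u - (u @ w_unit) * w_unit, 0.0))

w1 = np.array([1.0, 0.0])
w2 = np.array([1.0, 1.0])
print(in_span(T @ w1, w1))  # True:  T(1,0) = (1,0) stays in span{(1,0)}
print(in_span(T @ w2, w2))  # False: T(1,1) = (1,0) leaves span{(1,1)}
```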



The Characteristic Polynomial of Invariant Subspaces

So what is the point of invariant subspaces? They let us break off pieces of our map. What does that mean? If \(T: V \rightarrow V\) and \(W\) is \(T\)-invariant, then the restriction of \(T\) to \(W\), written \(T_W\), satisfies \(T_W: W \rightarrow W\). So we now work with a smaller space instead of the entire vector space \(V\). The following theorem describes the relationship between the characteristic polynomial of \(T:V \rightarrow V\) and the characteristic polynomial of \(T_W: W \rightarrow W\).

Theorem
Suppose \(T: V \rightarrow V\) is linear and \(\dim(V) < \infty\). If \(W\) is \(T\)-invariant, then the characteristic polynomial of \(T_W\) divides that of \(T\).



Proof

Let \(\beta_W = \{v_1,...,v_k\}\) be a basis of \(W\). Extend this to a basis for \(V\), so

$$ \begin{align*} \beta = \{v_1,...,v_k,v_{k+1},...,v_n\} \end{align*} $$

We can express \(T\) with respect to \(\beta\) to get

$$ \begin{align*} [T]_{\beta}^{\beta} &= \begin{pmatrix} [T(v_1)]_{\beta} & \cdots & [T(v_n)]_{\beta} \end{pmatrix} \end{align*} $$

We know that \(T(v_i) \in W\) for \(i = 1,...,k\) since \(W\) is \(T\)-invariant. Therefore, we can express each \(T(v_i)\) as a linear combination of just the first \(k\) vectors in \(\beta\); the coefficients on the remaining vectors of \(\beta\) are all zero. So in the first \(k\) columns of \([T]_{\beta}^{\beta}\), every entry below the \(k\)th row is zero. Let \(B_1\) denote the \(k \times k\) block of coefficients that can be nonzero.

$$ \begin{align*} [T]_{\beta}^{\beta} &= \left(\begin{array}{c|c} B_1 & B_2 \\ \hline 0 & B_3 \end{array}\right) \end{align*} $$

We claim that \(B_1 = [T]_{\beta_W}^{\beta_W}\). Now, we want to determine the characteristic polynomial of \([T]_{\beta}^{\beta}\).

$$ \begin{align*} \det([T]_{\beta}^{\beta} - tI_n) &= \det\left(\begin{array}{c|c} [T]_{\beta_W}^{\beta_W} - tI_k & B_2 \\ \hline 0 & B_3 - tI_{n-k} \end{array}\right) \end{align*} $$

This is a block upper triangular matrix. The determinant has a nice form for such matrices and we can write

$$ \begin{align*} \det([T]_{\beta}^{\beta} - tI_n) &= \det\left(\begin{array}{c|c} [T]_{\beta_W}^{\beta_W} - tI_k & B_2 \\ \hline 0 & B_3 - tI_{n-k} \end{array}\right) \\ &= \det([T]_{\beta_W}^{\beta_W} - tI_k)\,g(t) \end{align*} $$

where \(g(t) = \det(B_3 - tI_{n-k})\) is a polynomial in \(t\).

From this we see that the characteristic polynomial of \([T]^{\beta_W}_{\beta_W}\) divides the characteristic polynomial of \([T]_{\beta}^{\beta}\). \(\blacksquare\)
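The divisibility conclusion can be checked on a concrete matrix. A minimal sketch using sympy, where the block upper triangular matrix below (its entries in particular) is an assumption chosen only for illustration:

```python
import sympy as sp

t = sp.symbols('t')

# A block upper triangular matrix, as in the proof: the top-left
# 2x2 block plays the role of [T_W], and the bottom-left block is 0.
A = sp.Matrix([
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [0, 0, 9, 10],
    [0, 0, 11, 12],
])
B1 = A[:2, :2]  # the block representing the restriction T_W

f = A.charpoly(t).as_expr()    # characteristic polynomial of T
g = B1.charpoly(t).as_expr()   # characteristic polynomial of T_W

# g divides f exactly when polynomial division leaves remainder 0.
q, r = sp.div(f, g, t)
print(sp.simplify(r))  # 0
```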



T-cyclic Subspaces

Since \(T\)-invariant subspaces are useful, the natural question is: can we produce them? Is there a tool or mechanism for finding them? We start with the following definition.

Definition
For \(v \in V\) the \(T\)-cyclic subspace generated by \(v\) is \(W=span\{v, T(v), T^2(v),...\} \subseteq V\).


Observe here that \(W\) is \(T\)-invariant. Why? Take any element \(w \in W\); then \(T(w)\) is still in \(W\). To see this, write \(w = a_0v + a_1T(v) + ... + a_kT^k(v)\) and notice that

$$ \begin{align*} T(a_0v + a_1T(v) + ... + a_kT^k(v)) = a_0T(v) + a_1T^2(v)+...+a_kT^{k+1}(v) \in W \end{align*} $$
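We can watch this happen numerically by stacking \(v, T(v), T^2(v), ...\) as columns of a matrix. A sketch with numpy, where the specific \(T\) (a cyclic permutation) and starting vector \(v\) are assumptions chosen for illustration:

```python
import numpy as np

# Illustrative T on R^3 (a cyclic permutation of coordinates) and a
# starting vector v.
T = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
v = np.array([1.0, 0.0, 0.0])

# Columns v, Tv, T^2 v span the T-cyclic subspace W generated by v.
K = np.column_stack([v, T @ v, T @ T @ v])

# T-invariance: each T(T^i v) = T^{i+1} v should lie in the column
# space of K, so appending T @ K must not increase the rank.
augmented = np.column_stack([K, T @ K])
print(np.linalg.matrix_rank(augmented) == np.linalg.matrix_rank(K))  # True
```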

Question: Are all \(T\)-invariant subspaces \(T\)-cyclic?

The answer is no. Suppose

$$ \begin{align*} T \ &: \mathbf{R}^3 \rightarrow \mathbf{R}^3 \\ &(x,y,z) \mapsto (x,y,0) \end{align*} $$

\(W = \{(x,y,0) \ | \ x, y \in \mathbf{R}\}\) is \(T\)-invariant: \(T\) maps it to itself. In fact \(T_W = I_W\).

So \(W\) is not \(T\)-cyclic. Take any \((x, y, 0) \in W\); since \(T\) fixes it,

$$ \begin{align*} span\{(x,y,0), T(x,y,0),...\} = span\{(x,y,0)\} \end{align*} $$

This span is at most 1-dimensional, while \(\dim W = 2\), so no single vector generates \(W\).
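A quick numeric confirmation of the counterexample, with a sample vector \(w \in W\) chosen as an assumption:

```python
import numpy as np

# T(x, y, z) = (x, y, 0) written as a matrix.
T = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])

# W = {(x, y, 0)} is 2-dimensional, but T fixes every vector in W,
# so the cyclic subspace generated by any single w in W is only
# span{w}, which is 1-dimensional.
w = np.array([2.0, 3.0, 0.0])  # a sample vector in W
K = np.column_stack([w, T @ w, T @ T @ w])
print(np.linalg.matrix_rank(K))  # 1: no single vector generates W
```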


Theorem
Suppose \(T: V \rightarrow V\) is linear and \(\dim(V) < \infty\). Let \(W\) be a \(T\)-cyclic subspace generated by \(v\). Set \(\dim W = k \leq \dim V\). Then
  • \(\{v,T(v),...,T^{k-1}(v)\}\) is a basis for \(W\)
  • If \(T^k(v) = a_0v + ... + a_{k-1}T^{k-1}(v)\), then the characteristic polynomial of \(T_W\) is \((-1)^{k+1}(a_0 + a_1t + ... + a_{k-1}t^{k-1} - t^k)\)


For (a): since \(\dim W = k\) is finite, it is natural to ask whether just \(k\) of the infinitely many generating vectors already span \(W\), and the answer is yes.

Proof:

We’ll start with a proof of (b) given (a). To say anything about the characteristic polynomial, we need to find a basis and then compute the matrix of \(T_W\) with respect to that basis. A natural choice is the basis given to us in (a), so

$$ \begin{align*} \beta_W = \{v,T(v),...,T^{k-1}(v)\} \end{align*} $$

Next we need to compute \([T_W]_{\beta_W}^{\beta_W}\).

$$ \begin{align*} [T_W]_{\beta_W}^{\beta_W} &= \begin{pmatrix} [T_W(v)]_{\beta_W} & \cdots & [T_W(T^{k-1}(v))]_{\beta_W} \end{pmatrix}\\ &= \begin{pmatrix} 0 & 0 & \cdots & \cdots & 0 & a_0 \\ 1 & 0 & \cdots & \cdots & 0 & a_1 \\ 0 & 1 & \ddots & \cdots & 0 & a_2 \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ \vdots & \vdots & \ddots & \ddots & \vdots & a_{k-2} \\ 0 & 0 & \cdots & \cdots & 1 & a_{k-1} \\ \end{pmatrix} \end{align*} $$

The first column holds the coefficients of \(T_W(v) = T(v)\) with respect to \(\beta_W\): a 1 in the slot for \(T(v)\) and 0 everywhere else. The same pattern holds for columns 2 through \(k-1\), since each \(T_W(T^{i-1}(v)) = T^i(v)\) is itself a basis vector. But for the last column, we need to represent \(T_W(T^{k-1}(v)) = T^k(v)\). We’re given \(T^k(v) = a_0v + ... + a_{k-1}T^{k-1}(v)\), so the entries are \(a_0, a_1, ..., a_{k-1}\).

Now we can compute the determinant by expanding along the first row.

$$ \begin{align*} \det([T_W]_{\beta_W}^{\beta_W} -tI_k) &= \det \begin{pmatrix} -t & 0 & \cdots & \cdots & 0 & a_0 \\ 1 & -t & \cdots & \cdots & 0 & a_1 \\ 0 & 1 & \ddots & \cdots & 0 & a_2 \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ \vdots & \vdots & \ddots & \ddots & \vdots & a_{k-2} \\ 0 & 0 & \cdots & \cdots & 1 & a_{k-1}-t \\ \end{pmatrix} \\ &= (-1)^{1+1}(-t)\det \begin{pmatrix} -t & \cdots & \cdots & 0 & a_1 \\ 1 & \ddots & \cdots & 0 & a_2 \\ \vdots & \ddots & \ddots & \vdots & \vdots \\ \vdots & \ddots & \ddots & \vdots & a_{k-2} \\ 0 & \cdots & \cdots & 1 & a_{k-1}-t \\ \end{pmatrix} +(-1)^{1+k}a_0 (1) \end{align*} $$

The determinant multiplying \(a_0\) is 1 because that submatrix is upper triangular with 1s on the diagonal, so its determinant is the product of the diagonal entries.

Next, we want to compute this new determinant, but notice that it has the same pattern, so

$$ \begin{align*} &= (-t) \det \begin{pmatrix} -t & \cdots & \cdots & 0 & a_1 \\ 1 & \ddots & \cdots & 0 & a_2 \\ \vdots & \ddots & \ddots & \vdots & \vdots \\ \vdots & \ddots & \ddots & \vdots & a_{k-2} \\ 0 & \cdots & \cdots & 1 & a_{k-1}-t \\ \end{pmatrix} +(-1)^{1+k}a_0 \\ &= (-t) \left( (-t) \det \begin{pmatrix} -t & \cdots & \cdots & 0 & a_2 \\ 1 & \ddots & \cdots & 0 & a_3 \\ \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & \cdots & \ddots & -t & a_{k-2} \\ 0 & \cdots & \cdots & 1 & a_{k-1}-t \\ \end{pmatrix} + (-1)^k a_1 \right) +(-1)^{1+k}a_0 \\ &= (-1)^2 t^2 \det \begin{pmatrix} -t & \cdots & \cdots & 0 & a_2 \\ 1 & \ddots & \cdots & 0 & a_3 \\ \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & \cdots & \ddots & -t & a_{k-2} \\ 0 & \cdots & \cdots & 1 & a_{k-1}-t \\ \end{pmatrix} + (-1)^{k+1} (ta_1 + a_0) \\ &= (-1)^{k+1}(a_0 + ta_1 + ... + t^{k-1}a_{k-1} - t^k) \end{align*} $$

Repeating this expansion until the remaining block is \(1 \times 1\) produces the last line. \(\blacksquare\)
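As a sanity check on part (b), we can compare \(\det([T_W]_{\beta_W}^{\beta_W} - tI_k)\) against the formula \((-1)^{k+1}(a_0 + a_1t + ... + a_{k-1}t^{k-1} - t^k)\) for a small case. A minimal sketch using sympy, with \(k = 3\) and sample coefficients \(a_0, a_1, a_2\) chosen as assumptions:

```python
import sympy as sp

t = sp.symbols('t')

# Companion-style matrix from the proof, for k = 3 and illustrative
# coefficients a_0 = 2, a_1 = -1, a_2 = 3.
a0, a1, a2 = 2, -1, 3
k = 3
A = sp.Matrix([
    [0, 0, a0],
    [1, 0, a1],
    [0, 1, a2],
])

# det(A - t I) should match (-1)^{k+1}(a_0 + a_1 t + a_2 t^2 - t^3).
lhs = (A - t * sp.eye(k)).det()
rhs = (-1)**(k + 1) * (a0 + a1 * t + a2 * t**2 - t**3)
print(sp.expand(lhs - rhs))  # 0
```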





References

  • Math416 by Ely Kerman