Motivation

Definition
\(A \in M_{n \times n}\) is diagonal if all of its off-diagonal entries are 0, that is, \(A_{ij} = 0\) whenever \(i \neq j\).


Computations involving diagonal matrices are simple. If \(A\) is diagonal, then its determinant is the product of its diagonal entries:

$$ \begin{align*} \det(A) = A_{11} \cdots A_{nn} \end{align*} $$

Similarly, multiplying two diagonal matrices \(A\) and \(B\) is simple. The \(ij\)th entry of \(AB\) is

$$ \begin{align*} (AB)_{ij} &= \begin{cases} A_{ii}B_{ii} & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \end{align*} $$

This generalizes to powers: the \(ij\)th entry of \(A^k\) is

$$ \begin{align*} (A^k)_{ij} &= \begin{cases} (A_{ij})^k \quad \text{if } i = j \\ 0\phantom{(A_{ij})} \quad \text{if } i \neq j \end{cases} \end{align*} $$
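The three formulas above can be checked numerically. The following is a minimal sketch in plain Python, assuming a diagonal matrix is stored as the list of its diagonal entries; the helper names (`diag_det`, `diag_mul`, `diag_pow`, `to_dense`, `dense_mul`) are made up for this illustration.

```python
from math import prod

def diag_det(d):
    """Determinant of a diagonal matrix: the product of its diagonal entries."""
    return prod(d)

def diag_mul(a, b):
    """Product of two diagonal matrices: multiply the diagonals entry by entry."""
    return [x * y for x, y in zip(a, b)]

def diag_pow(a, k):
    """k-th power of a diagonal matrix: raise each diagonal entry to the k-th power."""
    return [x ** k for x in a]

def to_dense(d):
    """Expand a list of diagonal entries into a full n x n matrix (list of rows)."""
    n = len(d)
    return [[d[i] if i == j else 0 for j in range(n)] for i in range(n)]

def dense_mul(X, Y):
    """Ordinary matrix product, used only to cross-check the shortcut."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

a = [2, 3, 5]
b = [1, -1, 4]

print(diag_det(a))     # 30
print(diag_mul(a, b))  # [2, -3, 20]
print(diag_pow(a, 3))  # [8, 27, 125]

# The shortcut agrees with the full matrix product.
assert dense_mul(to_dense(a), to_dense(b)) == to_dense(diag_mul(a, b))
```

Each shortcut touches only the \(n\) diagonal entries, while the general matrix product touches all \(n^2\) entries.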

This leads to the question of whether we can transform any matrix to a diagonal matrix so we can perform these computations easily. In the next definition we formalize this.



When is \(A\) Diagonalizable?

Definition
\(T \in V \rightarrow V\) is diagonalizable if there is a basis \(\beta = \{v_1,...,v_n\}\) of \(V\) such that $$ \begin{align*} [T]_{\beta}^{\beta} = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix}. \end{align*} $$


Two questions arise from this definition:
Question 1: Does such a basis exist?
Question 2: If it exists, how can we compute it?

Suppose we have a basis \(\beta = \{v_1,...,v_n\}\) such that

$$ [T]_{\beta}^{\beta} = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix} $$


This is equivalent to

  • \(\leftrightarrow\) By definition, the matrix above is
    $$ [T]_{\beta}^{\beta} = \begin{pmatrix} [T(v_1)]_{\beta} & \cdots & [T(v_n)]_{\beta} \end{pmatrix} $$
  • \(\leftrightarrow\) Comparing columns and factoring out \(\lambda_j\), we see that \([T(v_j)]_{\beta} = \lambda_j \begin{pmatrix} 0 & \cdots & 1 & \cdots & 0 \end{pmatrix}^t \), with the 1 in position \(j\). This column is the \(j\)th vector of the standard basis, which is exactly \([v_j]_{\beta}\) because \(v_j = 0v_1 + \cdots + 1v_j + \cdots + 0v_n\). So we can write it as \(\lambda_j[v_j]_{\beta}\).
  • \(\leftrightarrow\) We can take the constant \(\lambda_j\) inside since \([ \quad ]_{\beta}\) is a linear map to see that
    $$ [T(v_j)]_{\beta} = \lambda_j[v_j]_{\beta} = [\lambda_jv_j]_{\beta} \text{ for } j = 1,...,n. $$
  • \(\leftrightarrow \) Because both sides of the equation are coordinate vectors with respect to the same basis \(\beta\), and the coordinate map \([ \quad ]_{\beta}\) is one-to-one, we can drop it and write
    $$ T(v_j) = \lambda_jv_j \text { for } j = 1,...,n \text { and } \lambda_1,...,\lambda_n \in \mathbf{R} $$
  • \(\leftrightarrow T(v) = \lambda v\).
  • \(\leftrightarrow T(v) = \lambda I_V(v)\). (Since the identity map \(I_V\) does nothing)
  • \(\leftrightarrow T(v) - \lambda I_V(v) = \bar{0}_V\).
  • \(\leftrightarrow (T - \lambda I_V)(v) = \bar{0}_V\).

The left-hand side is a family of linear maps \(T - \lambda I_V\) parameterized by \(\lambda\). For a fixed \(\lambda\), the solutions are the vectors \(v\) in the nullspace of \(T - \lambda I_V\). We only care about the non-zero solutions, since the zero vector cannot be part of a basis.



Eigenvectors and Eigenvalues of a Linear Transformation

So now we’ve seen that finding such a basis boils down to finding all vectors such that \(T(v) = \lambda v\). These vectors are called eigenvectors. More formally,

Definition
An eigenvector of \(\ T \in V \rightarrow V\) is a \(v \neq \bar{0}_V\) such that, for some \(\lambda \in \mathbf{R}\), $$ \begin{align*} T(v) = \lambda v \end{align*} $$


And the \(\lambda\)’s are called eigenvalues:

Definition
\(\lambda \in \mathbf{R}\) is an eigenvalue of \(\ T \in V \rightarrow V\) if \(\exists v \neq \bar{0}_V\) such that \(T(v) = \lambda v\).


We can now restate what it means to be diagonalizable as the following,

Theorem
\(T \in V \rightarrow V\) is diagonalizable if and only if there is a basis \(\beta = \{v_1,...,v_n\}\) consisting of eigenvectors.


As we’ve seen before, finding a basis \(\beta\) where

$$ \begin{align*} [T]_{\beta}^{\beta} = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix}, \lambda_1,...,\lambda_n \text{ eigenvalues} \end{align*} $$

is equivalent to finding a basis of eigenvectors, that is, basis vectors that each satisfy \(T(v) = \lambda v\) for some \(\lambda\). This is all great. But now, instead of looking at general linear maps that satisfy these conditions, let’s turn our focus to matrices.



Eigenvectors and Eigenvalues of Matrices

When is a matrix diagonalizable, and what are the eigenvectors and eigenvalues of a given matrix \(A\)?

Definition
\(A \in M_{n \times n}\) is diagonalizable if \(L_A\) is diagonalizable, where \(L_A\) is the left-multiplication map \(v \mapsto Av\) on \(\mathbf{R}^n\).


This is equivalent to “There is an invertible \(Q \in M_{n \times n}\) such that \(Q^{-1}AQ\) is diagonal”.
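To make the \(Q^{-1}AQ\) statement concrete, here is a small sketch in plain Python. The matrix `A`, its eigenvectors \((1, 1)\) and \((1, -1)\), and the helper `matmul` are all chosen for this illustration and do not come from the notes. The columns of `Q` are the two eigenvectors, and exact fractions are used so the check involves no rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# Illustrative matrix: A(1, 1) = 3(1, 1) and A(1, -1) = 1(1, -1).
A = [[2, 1],
     [1, 2]]

# Columns of Q are the eigenvectors (1, 1) and (1, -1).
Q = [[1, 1],
     [1, -1]]

# Inverse of Q, written out by hand for this 2 x 2 case.
half = Fraction(1, 2)
Q_inv = [[half, half],
         [half, -half]]

# Sanity check that Q_inv really is the inverse of Q.
assert matmul(Q_inv, Q) == [[1, 0], [0, 1]]

D = matmul(matmul(Q_inv, A), Q)
print(D == [[3, 0], [0, 1]])  # True: the eigenvalues appear on the diagonal
```

The diagonal entries of `D` appear in the same order as the corresponding eigenvector columns of `Q`.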

Definition
An eigenvector of \(A \in M_{n \times n}\) is a non-zero \(v \in \mathbf{R}^n\) such that \(Av = \lambda v\) for some \(\lambda \in \mathbf{R}\).


and finally,

Definition
\(\lambda \in \mathbf{R}\) is an eigenvalue of \(A \in M_{n \times n}\) if there exists a non-zero \(v \in \mathbf{R}^n\) such that \(Av = \lambda v\).





Finding the Eigenvectors

Okay, now that we’ve narrowed down the discussion to matrices, how do we actually find these eigenvectors of \(A\)? Define the nullspace of \(A\) as \(N(A) = N(L_A)\), that is, the set of all \(v \in \mathbf{R}^n\) with \(Av = \bar{0}\). Next we will need the following lemma

Lemma
A non-zero \(v \in \mathbf{R}^n\) is an eigenvector of \(A\) with eigenvalue \(\lambda\) if and only if \(v \in N(A - \lambda I_n)\).


Proof:

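Suppose \(v\) is an eigenvector of \(A\) with eigenvalue \(\lambda\). By definition this means that \(v \neq \bar{0}\) and \(Av = \lambda v\). We can rewrite this as

$$ \begin{align*} Av &= \lambda I_n v \\ Av - \lambda I_n v &= \bar{0} \\ (A - \lambda I_n)v &= \bar{0} \end{align*} $$

But this precisely means that \(v \in N(A - \lambda I_n)\). Each step is reversible, so the converse holds as well. \(\blacksquare\)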



Example

Find all the eigenvectors of

$$ \begin{align*} A = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 2 & 0 \\ -2 & 0 & 3 \end{pmatrix} \end{align*} $$

with eigenvalue 1.

By the lemma, we want all vectors in the null space of \(A - \lambda I_n\).

$$ \begin{align*} A - (1)I_n = \begin{pmatrix} -1 & 0 & 1 \\ 0 & 1 & 0 \\ -2 & 0 & 2 \end{pmatrix} \end{align*} $$

We’ll put this matrix in row echelon form.

$$ \begin{align*} \begin{pmatrix} -1 & 0 & 1 \\ 0 & 1 & 0 \\ -2 & 0 & 2 \end{pmatrix} \xrightarrow{R_3 \rightarrow -2R_1 + R_3, \ R_1 \rightarrow -R_1} \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \end{align*} $$

From this we see that the null space is

$$ \begin{align*} N(A - I_n) &= \{(x_1, x_2, x_3) \ | \ x_3 = t, x_1 = t, x_2 = 0\} \\ &= \{(t, 0, t) \ | \ t \in \mathbf{R} \} \\ &= span\{ (1,0,1) \} \end{align*} $$

So the eigenvectors with eigenvalue 1 are exactly the non-zero multiples of \((1,0,1)\).
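The row reduction in this example can also be carried out mechanically. The sketch below is one way to compute a basis of \(N(A - \lambda I_n)\) in plain Python with exact fractions; `null_space_basis` is a hypothetical helper written for this illustration, not a library function.

```python
from fractions import Fraction

def null_space_basis(M):
    """Basis of the null space of M, found by reducing M to reduced row echelon form."""
    rows = [[Fraction(x) for x in row] for row in M]
    n_rows, n_cols = len(rows), len(rows[0])
    pivots = []  # pivot column of each non-zero row, in row order
    r = 0
    for c in range(n_cols):
        # Look for a row at or below r with a non-zero entry in column c.
        p = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot here, so column c is a free variable
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        # Clear column c in every other row.
        for i in range(n_rows):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    # One basis vector per free variable: set it to 1 and solve for the pivots.
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n_cols
        v[f] = Fraction(1)
        for row_idx, c in enumerate(pivots):
            v[c] = -rows[row_idx][f]
        basis.append(v)
    return basis

A = [[0, 0, 1],
     [0, 2, 0],
     [-2, 0, 3]]
lam = 1
n = len(A)
A_minus_lam_I = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
                 for i in range(n)]

basis = null_space_basis(A_minus_lam_I)
print(basis == [[1, 0, 1]])  # True: the null space is span{(1, 0, 1)}

# Check the defining property Av = lam * v directly.
v = basis[0]
Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
assert Av == [lam * x for x in v]
```

The result matches the hand computation: the only basis vector returned is \((1, 0, 1)\).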

This is easy because we are given the eigenvalue. But typically, we need to find the eigenvalues too!



Eigenspace

Definition
If \(\lambda\) is an eigenvalue of \(A\), then the eigenspace of \(A\) corresponding to \(\lambda\) is $$ \begin{align*} E_{\lambda} &= N(A - \lambda I_n) \\ &= \{ \text{eigenvectors for } \lambda \} \cup \{\bar{0}\} \end{align*} $$


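Note that the eigenspace is itself a nullspace: \(E_{\lambda}\) is the nullspace of \(A - \lambda I_n\), not of \(A\). For an eigenvalue \(\lambda\), the two coincide only when \(\lambda = 0\), where \(E_0 = N(A)\).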



Finding the Eigenvalues

Again, if we’re given an eigenvalue, then finding the eigenspace or the eigenvectors is simple. We’re just solving a system of linear equations like we did for finding the nullspace. The question is, how can we find the eigenvalues? For this we need the following theorem

Theorem 5.2
\(\lambda\) is an eigenvalue of \(A\) if and only if \(\det(A - \lambda I_n) = 0\).


Proof:

\(\lambda\) is an eigenvalue is equivalent to

$$ \begin{align*} &\leftrightarrow \exists v \neq \bar{0} \text{ such that } Av = \lambda v \\ &\leftrightarrow (A - \lambda I_n)(v) = \bar{0} \text{ for some } v \neq \bar{0}\\ &\leftrightarrow N(A - \lambda I_n) \neq \{\bar{0}\} \\ &\leftrightarrow A - \lambda I_n \text{ is not 1-1} \\ &\leftrightarrow A - \lambda I_n \text{ is not invertible} \\ &\leftrightarrow \det(A - \lambda I_n) = 0. \end{align*} $$


So we now have a necessary and sufficient condition for \(\lambda\) to be an eigenvalue of \(A\). So what’s next? \(A\) is given to us, so we need to find the values of \(\lambda\) that make \(\det(A - \lambda I_n)\) equal to zero. Let’s look at the following definition

Definition
\(f(t) = \det(A - tI_n)\) is the characteristic polynomial of \(A\).


What is this saying? We can interpret the right-hand side as a function of \(t\). We’re given \(A\) and we know the identity matrix, so the only unknown is \(t\). Inside the determinant we have a matrix whose entries depend on \(A\) and \(t\), and expanding the determinant gives an expression in \(t\). In fact, \(f(t)\) is a polynomial of degree \(n\) (stated here as a fact, without proof).

Based on this, we can rephrase the previous theorem as the following corollary,

Corollary 1
\(\lambda\) is an eigenvalue of \(A\) if and only if \(f(\lambda) = \det(A - \lambda I_n) = 0\).


So the eigenvalues are exactly the roots of this polynomial, and finding them is not always easy. How many roots can there be? The degree of \(f(t)\) is \(n\), so it has at most \(n\) roots. Therefore,

Corollary 2 (Theorem 5.3(b))
\(A\) has at most \(n\) eigenvalues.


So there can be at most \(n\) distinct roots, and hence at most \(n\) eigenvalues.
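As a sanity check on Corollary 1, we can evaluate \(f(t) = \det(A - tI_n)\) at a few values of \(t\) for the \(3 \times 3\) matrix from the earlier example. The sketch below uses a plain-Python cofactor expansion, which is fine for small matrices; `det` and `char_poly_at` are names made up for this illustration. Scanning integer candidates is only a demonstration, not a general root-finding method.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def char_poly_at(A, t):
    """Evaluate f(t) = det(A - t I_n) at a single value t."""
    n = len(A)
    return det([[A[i][j] - (t if i == j else 0) for j in range(n)]
                for i in range(n)])

A = [[0, 0, 1],
     [0, 2, 0],
     [-2, 0, 3]]

print(char_poly_at(A, 1))  # 0, so 1 is an eigenvalue (as in the earlier example)
print(char_poly_at(A, 0))  # 4, so 0 is not an eigenvalue

# Scan a few integer candidates for roots of f.
integer_roots = [t for t in range(-5, 6) if char_poly_at(A, t) == 0]
print(integer_roots)  # [1, 2]
```

The scan finds two distinct roots, which is consistent with the bound of at most \(n = 3\) eigenvalues.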



Example

Find the eigenvalues of \(A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\).

Let’s write the characteristic polynomial and find its roots:

$$ \begin{align*} f(t) = \det(A - tI_n) &= 0 \\ \det \begin{pmatrix} -t & -1 \\ 1 & -t \end{pmatrix} &= 0 \\ t^2 + 1 &= 0 \\ t^2 &= -1 \\ \end{align*} $$

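This polynomial has no real roots, and so the matrix \(A\) has no real eigenvalues. Since we defined eigenvalues as elements of \(\mathbf{R}\), \(A\) has no eigenvalues and therefore no eigenvectors, so it is not diagonalizable over \(\mathbf{R}\).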
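For a \(2 \times 2\) matrix with rows \((a, b)\) and \((c, d)\), expanding the determinant gives \(\det(A - tI_2) = t^2 - (a + d)t + (ad - bc)\), so the real eigenvalues can be read off from the quadratic formula. The sketch below assumes exactly this \(2 \times 2\) setting; `real_eigenvalues_2x2` is a hypothetical helper name, and it uses floating-point arithmetic, so results for general inputs are approximate.

```python
import math

def real_eigenvalues_2x2(A):
    """Real roots of t^2 - (a + d) t + (ad - bc) for A = [[a, b], [c, d]]."""
    (a, b), (c, d) = A
    trace = a + d
    determinant = a * d - b * c
    disc = trace * trace - 4 * determinant
    if disc < 0:
        return []  # the characteristic polynomial has no real roots
    root = math.sqrt(disc)
    # A set removes the duplicate when disc == 0 (a repeated root).
    return sorted({(trace - root) / 2, (trace + root) / 2})

# The matrix from the example: f(t) = t^2 + 1 has no real roots.
print(real_eigenvalues_2x2([[0, -1], [1, 0]]))  # []

# An illustrative symmetric matrix (not from the notes): f(t) = t^2 - 4t + 3.
print(real_eigenvalues_2x2([[2, 1], [1, 2]]))   # [1.0, 3.0]
```

The empty list for the first matrix reflects a negative discriminant, which is the same observation as \(t^2 = -1\) having no real solution.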



References

  • Math416 by Ely Kerman