QuantML.org

    Diagonalization of a Matrix

    Say that we have an
    \(n\times n\)
    matrix
    \(A\)
    with
    \(n\)
    independent eigenvectors, say
    \(\vec{x}_1,\vec{x}_2,\cdots,\vec{x}_n\).
    We put these eigenvectors as the columns of a matrix, say
    \(S\).
    \[ S=\begin{bmatrix} \vdots & \vdots & \cdots & \vdots \\ \vec{x}_{1} & \vec{x}_{2} & \cdots & \vec{x}_{n} \\ \vdots & \vdots & \cdots & \vdots \\ \end{bmatrix} \]
    \[A\vec{x}_i=\lambda_i\vec{x}_i;\quad \left\{\begin{matrix} \vec{x}_i\text{ is eigenvector}\\ \lambda_i\text{ is eigenvalue} \end{matrix}\right. \]
    Let's calculate
    \(AS\)

    \(AS = A \begin{bmatrix} \vdots & \vdots & \cdots & \vdots \\ \vec{x}_{1} & \vec{x}_{2} & \cdots & \vec{x}_{n} \\ \vdots & \vdots & \cdots & \vdots \\ \end{bmatrix}\)

    \(\Rightarrow AS = \begin{bmatrix} \vdots & \vdots & \cdots & \vdots \\ A\vec{x}_{1} & A\vec{x}_{2} & \cdots & A\vec{x}_{n} \\ \vdots & \vdots & \cdots & \vdots \\ \end{bmatrix}\)

    \(\Rightarrow AS = \begin{bmatrix} \vdots & \vdots & \cdots & \vdots \\ \lambda_1\vec{x}_{1} & \lambda_2\vec{x}_{2} & \cdots & \lambda_n\vec{x}_{n} \\ \vdots & \vdots & \cdots & \vdots \\ \end{bmatrix}\)

    \[\Rightarrow AS = \begin{bmatrix} \vdots & \vdots & \cdots & \vdots \\ \vec{x}_{1} & \vec{x}_{2} & \cdots & \vec{x}_{n} \\ \vdots & \vdots & \cdots & \vdots \\ \end{bmatrix} \underbrace{\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \\ \end{bmatrix}}_{\text{say }\Lambda} \]
    \(AS=S\Lambda\)

    Because
    \(S\)
    has independent columns,
    \(S^{-1}\)
    exists, so
    \[ \begin{matrix} \displaystyle \Lambda =S^{-1}AS \end{matrix} \]
    This is diagonalization.
    We can also get a factorization of
    \(A\)

    \[ \begin{matrix} \displaystyle A=S\Lambda S^{-1} \end{matrix} \]
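This factorization is easy to check numerically. A minimal NumPy sketch (the matrix \(A\) below is an arbitrary example with distinct eigenvalues, not one from the text):

```python
import numpy as np

# Example matrix with distinct eigenvalues (5 and 2), so it is diagonalizable
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.linalg.eig returns the eigenvalues and a matrix S whose columns are eigenvectors
eigvals, S = np.linalg.eig(A)
Lam = np.diag(eigvals)  # Lambda: eigenvalues on the diagonal

# Diagonalization: Lambda = S^{-1} A S
assert np.allclose(np.linalg.inv(S) @ A @ S, Lam)

# Factorization: A = S Lambda S^{-1}
assert np.allclose(S @ Lam @ np.linalg.inv(S), A)
```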


    Power of a matrix

    Now let's see how
    \(A=S\Lambda S^{-1}\)
    helps us in calculating
    \(A^k\)
    for
    \(k=\{1,2,3,\cdots\}\)

    We know that,
    \[A\vec{x}_i=\lambda_i\vec{x}_i;\quad \left\{\begin{matrix} \vec{x}_i\text{ is eigenvector}\\ \lambda_i\text{ is eigenvalue} \end{matrix}\right. \]
    Now let's calculate eigenvectors and eigenvalues of
    \(A^2\)
    .
    \(AA\vec{x}_i=A(\lambda_i \vec{x}_i)=\lambda_i (A \vec{x}_i)\)

    \(\Rightarrow A^2\vec{x}_i=\lambda_i^2\vec{x}_i\)

    So the eigenvectors of
    \(A^2\)
    remain the same as those of
    \(A\),
    but each eigenvalue becomes
    \(\lambda_i^2\).

    We can also get it by
    \(A=S\Lambda S^{-1}\)
    .
    \(A=S\Lambda S^{-1}\)

    \(\Rightarrow A^2=S\Lambda (S^{-1}S)\Lambda S^{-1}\)

    \(\Rightarrow A^2=S\Lambda \mathcal{I}\Lambda S^{-1}\)

    \(\Rightarrow A^2=S\Lambda \Lambda S^{-1}\)

    \(\Rightarrow A^2=S\Lambda^2 S^{-1}\)

    where,
    \(\Lambda^2=\begin{bmatrix} \lambda_1^2 & 0 & \cdots & 0 \\ 0 & \lambda_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^2 \\ \end{bmatrix}\)
    Now we can get formula for
    \(k^{\text{th}}\)
    power of
    \(A\)
    .
    \[ \begin{matrix} \displaystyle A^k = S\Lambda^k S^{-1} \end{matrix} \]

    So the eigenvectors of
    \(A^k\)
    remain the same as the eigenvectors of
    \(A\),
    but each eigenvalue becomes
    \(\lambda_i^k\).
    So we can compute powers of a matrix quickly, provided we have its
    \(n\)
    independent eigenvectors.
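As a quick numerical check of \(A^k=S\Lambda^k S^{-1}\), here is a NumPy sketch (the matrix is an arbitrary diagonalizable example):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])  # arbitrary diagonalizable example

eigvals, S = np.linalg.eig(A)

k = 10
# A^k = S Lambda^k S^{-1}: only the n diagonal entries get raised to the k-th power
A_k = S @ np.diag(eigvals**k) @ np.linalg.inv(S)

# Agrees with repeated matrix multiplication
assert np.allclose(A_k, np.linalg.matrix_power(A, k))
```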
    When do we call a matrix
    \(A\)
    stable?
    A matrix
    \(A\)
    is stable if
    \(A^k\to 0\)
    as
    \(k\to\infty\)
    .
    And
    \(A^k\to 0\)
    as
    \(k\to\infty\)
    if,
    \(|\lambda_i| \lt 1;\quad\)
    \(\forall\ i\in\{1,2,\cdots,n\}\)
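A small NumPy sketch of the stability condition (the matrix is an arbitrary example whose eigenvalues, \(0.6\) and \(0.3\), both have magnitude below \(1\)):

```python
import numpy as np

# Example matrix with eigenvalues 0.6 and 0.3, both inside the unit circle
A = np.array([[0.5, 0.1],
              [0.2, 0.4]])

eigvals = np.linalg.eigvals(A)
assert np.all(np.abs(eigvals) < 1)  # stability condition |lambda_i| < 1

# Hence A^k shrinks toward the zero matrix as k grows
assert np.allclose(np.linalg.matrix_power(A, 100), np.zeros((2, 2)))
```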
    When is a matrix
    \(A\)
    diagonalizable?
    A diagonalizable matrix has
    \(n\)
    independent eigenvectors.
    So
    \(A\)
    is certainly diagonalizable if all the
    \(\lambda\)
    's are different:
    \(\lambda_i\neq\lambda_j;\quad \forall \ i\neq j\)

    If all the eigenvalues are different then all the eigenvectors are independent.

    But the reverse is not true: independence of the eigenvectors does not imply that all eigenvalues are different.
    Example
    Say
    \(A=\mathcal{I}_{3\times 3}=\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix}\)

    When we apply the transformation
    \(\mathcal{I}\),
    nothing changes, so every vector is an eigenvector of
    \(\mathcal{I}\).
    But there is only one eigenvalue, and that is
    \(1\).


    \(\vec{u}_{k+1}=A\vec{u}_k\)

    Say we have a vector
    \(\vec{u}_0\)
    and a non-singular matrix
    \(A\)
    with
    \(n\)
    independent eigenvectors.
    We iteratively apply the transformation
    \(A\)
    to
    \(\vec{u}_0\).
    \[\vec{u}_{k+1}=A\vec{u}_k\]
    \(\vec{u}_{1}=A\vec{u}_0\)

    \(\vec{u}_2=A^2\vec{u}_0\)

    \(\vec{u}_k=A^k\vec{u}_0\)


    How can we find
    \(\vec{u}_k\)
    for an arbitrarily large
    \(k\)?

    Since
    \(A\)
    has
    \(n\)
    independent eigenvectors, these eigenvectors span the whole
    \(n\)-dimensional space, so
    \(\vec{u}_0\)
    lives in the vector space spanned by these eigenvectors.
    So
    \(\vec{u}_0\)
    is a linear combination of these eigenvectors.
    say the eigenvectors of
    \(A\)
    are
    \(\vec{x}_1,\vec{x}_2,\cdots,\vec{x}_n\)
    and eigenvalues of
    \(A\)
    are
    \(\lambda_1,\lambda_2,\cdots,\lambda_n\)

    So we can say that,
    \(\vec{u}_0=c_1\vec{x}_1+c_2\vec{x}_2+\cdots+c_n\vec{x}_n\)

    \(\Rightarrow A\vec{u}_0=c_1A\vec{x}_1+c_2A\vec{x}_2+\cdots+c_nA\vec{x}_n\)

    \(\Rightarrow A\vec{u}_0=c_1\lambda_1\vec{x}_1+c_2\lambda_2\vec{x}_2+\cdots+c_n\lambda_n\vec{x}_n\)

    \(\Rightarrow A^{k}\vec{u}_0=c_1\lambda_1^{k}\vec{x}_1+c_2\lambda_2^{k}\vec{x}_2+\cdots+c_n\lambda_n^{k}\vec{x}_n\)

    \(\Rightarrow A^{k}\vec{u}_0=S\Lambda^k \vec{c}\)

    \(\Lambda=\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \\ \end{bmatrix};\quad\)
    \(S=\begin{bmatrix} \vdots & \vdots & \cdots & \vdots \\ \vec{x}_{1} & \vec{x}_{2} & \cdots & \vec{x}_{n} \\ \vdots & \vdots & \cdots & \vdots \\ \end{bmatrix};\quad\)
    \(\vec{c}=\begin{bmatrix} c_1\\c_2\\ \vdots \\c_n \end{bmatrix}\)

    So,
    \[ \begin{matrix} \displaystyle \vec{u}_k = A^{k}\vec{u}_0=S\Lambda^k \vec{c} \end{matrix} \]
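A minimal NumPy sketch of computing \(\vec{u}_k\) through the eigendecomposition (arbitrary example matrix and starting vector; \(\vec{c}\) is obtained by solving \(S\vec{c}=\vec{u}_0\)):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])   # arbitrary diagonalizable example
u0 = np.array([1.0, 1.0])    # arbitrary starting vector

eigvals, S = np.linalg.eig(A)
c = np.linalg.solve(S, u0)   # coefficients of u0 in the eigenvector basis

k = 8
u_k = S @ (eigvals**k * c)   # u_k = S Lambda^k c

# Agrees with applying A to u0 k times
assert np.allclose(u_k, np.linalg.matrix_power(A, k) @ u0)
```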


    Fibonacci example

    Fibonacci sequence:
    \(0,1,1,2,3,5,\cdots\)

    How can we find
    \(F_k\)
    for an arbitrarily large
    \(k\)?
    \(F_{k+2}=F_{k+1}+F_k\)

    Now let's write it in form of
    \(\vec{u}_{k+1}=A\vec{u}_k\)

    say
    \(\vec{u}_k= \begin{bmatrix} F_{k+1}\\F_k\\ \end{bmatrix}\)

    and, using the recurrence,
    \(\vec{u}_{k+1}= \begin{bmatrix} F_{k+2}\\F_{k+1}\\ \end{bmatrix}= \begin{bmatrix} F_{k+1}+F_{k}\\F_{k+1}\\ \end{bmatrix}= \begin{bmatrix} 1&1\\1&0\\ \end{bmatrix}\begin{bmatrix} F_{k+1}\\F_k\\ \end{bmatrix}=A\vec{u}_k\)

    \(\Rightarrow A=\begin{bmatrix} 1&1\\1&0\\ \end{bmatrix}\)

    What are its eigenvalues and eigenvectors?
    \(\text{det}(A - \lambda\mathcal{I})=\begin{vmatrix} 1-\lambda & 1 \\ 1 & -\lambda \\ \end{vmatrix}\)

    \(\text{det}(A - \lambda\mathcal{I})=\lambda^2-\lambda-1\)

    for
    \(\text{det}(A - \lambda\mathcal{I})=0\)
    we get,

    \(\lambda_1 = \frac{1+\sqrt{5}}{2}\)


    \(\lambda_2 = \frac{1-\sqrt{5}}{2}\)


    We get our eigenvalues
    \(\lambda_1, \lambda_2\)
    .
    \((A-\lambda\mathcal{I})\vec{x}=0\)

    \(\begin{bmatrix} 1-\lambda & 1 \\ 1 & -\lambda \\ \end{bmatrix} \vec{x}=0 \)

    by solving it we get,
    \(\vec{x}=\begin{bmatrix} \lambda\\ 1 \end{bmatrix}\)

    So our eigenvectors are
    \(\vec{x}_1=\begin{bmatrix} \lambda_1\\ 1 \end{bmatrix},\quad\)
    \(\vec{x}_2=\begin{bmatrix} \lambda_2\\ 1 \end{bmatrix}\)
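As a quick check, both roots satisfy \(\lambda^2=\lambda+1\), so
\[ A\vec{x}= \begin{bmatrix} 1&1\\1&0\\ \end{bmatrix} \begin{bmatrix} \lambda\\ 1 \end{bmatrix}= \begin{bmatrix} \lambda+1\\ \lambda \end{bmatrix}= \begin{bmatrix} \lambda^2\\ \lambda \end{bmatrix}= \lambda\begin{bmatrix} \lambda\\ 1 \end{bmatrix}=\lambda\vec{x} \]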


    Now we have our eigenvectors and eigenvalues.
    \(\vec{u}_0\)
    lives in the vector space spanned by these eigenvectors,
    so
    \(\vec{u}_0\)
    is a linear combination of these eigenvectors.
    The eigenvectors of
    \(A\)
    are
    \(\vec{x}_1,\vec{x}_2\)
    and the eigenvalues of
    \(A\)
    are
    \(\lambda_1,\lambda_2\)
    So we can say that,
    \(\vec{u}_0=c_1\vec{x}_1+c_2\vec{x}_2\)

    \(\Rightarrow A\vec{u}_0=c_1A\vec{x}_1+c_2A\vec{x}_2\)

    \(\Rightarrow A\vec{u}_0=c_1\lambda_1\vec{x}_1+c_2\lambda_2\vec{x}_2\)

    So,
    \[\vec{u}_k=A^{k}\vec{u}_0=c_1\left(\frac{1+\sqrt{5}}{2}\right)^{k}\vec{x}_1+c_2\left(\frac{1-\sqrt{5}}{2}\right)^{k}\vec{x}_2\]

    Now let's find
    \(c_1,c_2\)
    .
    we know
    \(\vec{u}_0=\begin{bmatrix} F_{1}\\F_0\\ \end{bmatrix}\)

    \(\Rightarrow \vec{u}_0=\begin{bmatrix} 1\\0\\ \end{bmatrix}\)

    And we know that
    \(\vec{u}_0=c_1\vec{x}_1+c_2\vec{x}_2\)
    so,
    \(\vec{u}_0=c_1\begin{bmatrix} \lambda_1\\ 1 \end{bmatrix}+c_2\begin{bmatrix} \lambda_2\\ 1 \end{bmatrix}\)

    \(\Rightarrow c_1\begin{bmatrix} \lambda_1\\ 1 \end{bmatrix}+c_2\begin{bmatrix} \lambda_2\\ 1 \end{bmatrix}=\begin{bmatrix} 1\\0\\ \end{bmatrix}\)

    Now we get a system of equations:
    \(c_1\lambda_1 + c_2\lambda_2 = 1\)

    \(c_1 + c_2 = 0\)

    By solving it we get,
    \(c_1=\frac{1}{\lambda_1-\lambda_2}\)
    and
    \(c_2=\frac{-1}{\lambda_1-\lambda_2}\)

    \(\lambda_1-\lambda_2 = \frac{1+\sqrt{5}}{2} - \frac{1-\sqrt{5}}{2}=\sqrt{5}\)
    We know that
    \(\vec{u}_k = A^k\vec{u}_0=c_1\lambda_1^k\vec{x}_1+c_2\lambda_2^k\vec{x}_2\)

    \(\Rightarrow \vec{u}_k =\frac{1}{\lambda_1-\lambda_2}\left(\lambda_1^k\vec{x}_1-\lambda_2^k\vec{x}_2\right)\)

    \(\Rightarrow \vec{u}_k =\frac{1}{\lambda_1-\lambda_2}\left( \lambda_1^k\begin{bmatrix} \lambda_1\\ 1 \end{bmatrix}- \lambda_2^k\begin{bmatrix} \lambda_2\\ 1 \end{bmatrix}\right)\)

    \(\Rightarrow \vec{u}_k =\frac{1}{\lambda_1-\lambda_2}\left( \begin{bmatrix} \lambda_1^{k+1}\\ \lambda_1^k \end{bmatrix}- \begin{bmatrix} \lambda_2^{k+1}\\ \lambda_2^k \end{bmatrix}\right)\)

    We know that,
    \(\vec{u}_k= \begin{bmatrix} F_{k+1}\\F_k\\ \end{bmatrix}\)

    So,
    \[F_k=\frac{\lambda_1^k - \lambda_2^k}{\lambda_1-\lambda_2}\]
    \(\lambda_1 = \frac{1+\sqrt{5}}{2}\)
    and
    \(\lambda_2 = \frac{1-\sqrt{5}}{2}\)
    so
    \[ \begin{matrix} \displaystyle F_k=\frac{1}{\sqrt{5}}\left( \left(\frac{1+\sqrt{5}}{2}\right)^k - \left(\frac{1-\sqrt{5}}{2}\right)^k \right) \end{matrix} \]
    For large
    \(k,\quad\)
    \(\left(\frac{1-\sqrt{5}}{2}\right)^{k}\approx 0\)
    because
    \(\left|\frac{1-\sqrt{5}}{2}\right| \lt 1\).

    So
    \[F_k\approx \frac{1}{\sqrt{5}}\left( \left(\frac{1+\sqrt{5}}{2}\right)^k \right)\]
    \(\frac{1+\sqrt{5}}{2}\)
    is known as the Golden Ratio.
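The closed form can be checked against the recurrence directly; a short Python sketch (the function name `fib_binet` is ours, not from the text):

```python
import math

def fib_binet(k: int) -> int:
    """Closed-form F_k from the eigenvalues of [[1, 1], [1, 0]]."""
    sqrt5 = math.sqrt(5)
    lam1 = (1 + sqrt5) / 2  # golden ratio
    lam2 = (1 - sqrt5) / 2
    return round((lam1**k - lam2**k) / sqrt5)

# Check against the recurrence F_{k+2} = F_{k+1} + F_k
fibs = [0, 1]
for _ in range(20):
    fibs.append(fibs[-1] + fibs[-2])
assert [fib_binet(k) for k in range(len(fibs))] == fibs
```

(The rounding is safe for moderate \(k\); in floating point the formula eventually loses precision for very large \(k\).)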