4.9.2 Answer

Part (A) First note that \(Q_{0}=0\) and \(Q_{m}=1\).

Let us define \(\beta \) as the event \(\left \{ \text {visit state } m \text { before state } 0\right \} \); then we write

\[ Q_{i}=\Pr \left ( \beta |X_{0}=i\right ) \]

But by conditioning on the state of the chain at time \(1\) instead of time \(0\), we write

\[ \Pr \left ( \beta |X_{0}=i\right ) ={\displaystyle \sum \limits _{k=0}^{m}} \Pr \left ( \beta |X_{1}=k\right ) \Pr \left ( X_{1}=k|X_{0}=i\right ) \]

But \(\Pr \left ( \beta |X_{1}=k\right ) =Q_{k}\) by the Markov property and time-homogeneity of the chain (the probability of the event \(\beta \) given the current state \(k\) does not depend on the time index), and \(\Pr \left ( X_{1}=k|X_{0}=i\right ) =P_{ik}\). Therefore the above becomes

\[ \fbox {$Q_{i}={\displaystyle \sum \limits _{k=0}^{m}} Q_{k}P_{ik}$}\]

Since \(Q_{0}=0\) and \(Q_{m}=1\), we can rewrite the above as follows

\begin{align*} Q_{i} & =0+P_{im}+{\displaystyle \sum \limits _{k=1}^{m-1}} Q_{k}P_{ik}\\ & =P_{im}+{\displaystyle \sum \limits _{k=1}^{m-1}} Q_{k}P_{ik}\end{align*}
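As a quick check, consider the smallest nontrivial case \(m=2\), where the only interior state is \(1\). The relation above reduces to \(Q_{1}=P_{1,2}+Q_{1}P_{1,1}\), so that

\[ Q_{1}=\frac {P_{1,2}}{1-P_{1,1}}=\frac {P_{1,2}}{P_{1,0}+P_{1,2}}\]

where the last step uses the fact that row \(1\) of the transition matrix sums to \(1\). This agrees with intuition: discounting self-loops at state \(1\), the chain reaches state \(2\) before state \(0\) with probability proportional to \(P_{1,2}\).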

If we examine the sum more closely, we see it is a product of a matrix and a vector. Expanding a few terms makes this clear:

\begin{align*} Q_{1} & =P_{1,m}+\left ( Q_{1}P_{1,1}+Q_{2}P_{1,2}+\cdots +Q_{m-1}P_{1,m-1}\right ) \\ Q_{2} & =P_{2,m}+\left ( Q_{1}P_{2,1}+Q_{2}P_{2,2}+\cdots +Q_{m-1}P_{2,m-1}\right ) \\ & \cdots \\ Q_{m-1} & =P_{m-1,m}+\left ( Q_{1}P_{m-1,1}+Q_{2}P_{m-1,2}+\cdots +Q_{m-1}P_{m-1,m-1}\right ) \end{align*}

Which can be written as

\[\begin {bmatrix} Q_{1}\\ Q_{2}\\ \vdots \\ Q_{m-1}\end {bmatrix} =\begin {bmatrix} P_{1,m}\\ P_{2,m}\\ \vdots \\ P_{m-1,m}\end {bmatrix} +\begin {bmatrix} P_{1,1} & P_{1,2} & \cdots & P_{1,m-1}\\ P_{2,1} & P_{2,2} & \cdots & P_{2,m-1}\\ \vdots & \vdots & \ddots & \vdots \\ P_{m-1,1} & P_{m-1,2} & \cdots & P_{m-1,m-1}\end {bmatrix}\begin {bmatrix} Q_{1}\\ Q_{2}\\ \vdots \\ Q_{m-1}\end {bmatrix} \]

Let \(x=\begin {bmatrix} Q_{1}\\ Q_{2}\\ \vdots \\ Q_{m-1}\end {bmatrix} \), let \(B=\begin {bmatrix} P_{1,1} & P_{1,2} & \cdots & P_{1,m-1}\\ P_{2,1} & P_{2,2} & \cdots & P_{2,m-1}\\ \vdots & \vdots & \ddots & \vdots \\ P_{m-1,1} & P_{m-1,2} & \cdots & P_{m-1,m-1}\end {bmatrix} \), and let \(b=\begin {bmatrix} P_{1,m}\\ P_{2,m}\\ \vdots \\ P_{m-1,m}\end {bmatrix} \). Then the above can be written as

\begin{align*} x & =b+Bx\\ x-Bx & =b\\ \left ( I-B\right ) x & =b \end{align*}

Where \(I\) is the identity matrix of order \(m-1\). Now let \(A=\left ( I-B\right ) \). Hence

\[ Ax=b \]

Therefore we can find \(x\) (which contains the values \(Q_{1},\ldots ,Q_{m-1}\)) by solving the above system, i.e. by inverting the matrix \(A\), provided that \(A\) is non-singular.
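To make the computation concrete, here is a minimal numerical sketch in Python (the helper name \verb|hitting_probabilities| and the gambler's-ruin example are illustrative choices, not part of the original problem): it extracts \(B\) and \(b\) from a transition matrix and solves \(\left ( I-B\right ) x=b\), then checks the result against the known closed form \(Q_{i}=\left ( 1-r^{i}\right ) /\left ( 1-r^{m}\right ) \), \(r=\left ( 1-p\right ) /p\), for the gambler's-ruin chain.

\begin{verbatim}
import numpy as np

def hitting_probabilities(P):
    # P is the (m+1)x(m+1) transition matrix on states 0..m.
    # Returns [Q_1, ..., Q_{m-1}]: the probability of visiting
    # state m before state 0, via (I - B) x = b as derived above.
    m = P.shape[0] - 1
    B = P[1:m, 1:m]   # transitions among interior states 1..m-1
    b = P[1:m, m]     # one-step probabilities of jumping to state m
    return np.linalg.solve(np.eye(m - 1) - B, b)

# Illustrative chain: gambler's ruin on 0..m with win probability p,
# reflecting at the ends so the chain stays irreducible.
m, p = 5, 0.4
P = np.zeros((m + 1, m + 1))
P[0, 1] = P[m, m - 1] = 1.0
for i in range(1, m):
    P[i, i + 1], P[i, i - 1] = p, 1 - p

Q = hitting_probabilities(P)
r = (1 - p) / p   # closed form: Q_i = (1 - r^i) / (1 - r^m)
print(np.allclose(Q, [(1 - r**i) / (1 - r**m) for i in range(1, m)]))
\end{verbatim}

Note that only the interior rows \(1,\ldots ,m-1\) of the transition matrix enter \(B\) and \(b\), so the choice of boundary behavior at states \(0\) and \(m\) does not affect the computed \(Q_{i}\).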

Part (B)

Now we need to show that \(\left ( I-B\right ) \) is invertible. Recall that a matrix \(A\) is not invertible if and only if we can find a vector \(\mathbf {v}\neq \mathbf {0}\) such that \(A\mathbf {v}=\mathbf {0}\).

Let us assume that \(\left ( I-B\right ) \) is not invertible. Hence there exists a vector \(\mathbf {v}\neq \mathbf {0}\) such that

\[ \left ( I-B\right ) \mathbf {v}=\mathbf {0}\]

In other words

\begin{equation} \mathbf {v}=B\mathbf {v} \tag {1}\end{equation}

Now we show that no such vector \(\mathbf {v}\) can exist, so \(\left ( I-B\right ) \) must therefore have an inverse.

We can always normalize the vector \(\mathbf {v}\) in (1) without changing this relation, hence we assume \(\mathbf {v}\) is normalized such that its largest component equals \(1\) (we do this by dividing the vector by its largest component). Now (1) can be written in component form as follows

\begin{equation} v_{i}={\displaystyle \sum \limits _{j=1}^{m-1}} p_{ij}v_{j} \tag {2}\end{equation}

Since \(\mathbf {v}\) is normalized, it has at least one component equal to \(1\) in value, and it may have components less than \(1\) in value (we prove below that it must). Let \(J\) be the set of indices of the components of \(\mathbf {v}\) that equal \(1\), and let \(S\) be the set of indices of the components that are less than \(1\). In other words

\begin{align*} J & =\left \{ i:v_{i}=1\right \} \\ S & =\left \{ i:v_{i}<1\right \} \end{align*}
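For example (with hypothetical numbers), if \(m-1=4\) and the normalized vector is \(\mathbf {v}=\left ( 1,\,0.5,\,1,\,0.3\right ) \), then \(J=\left \{ 1,3\right \} \) and \(S=\left \{ 2,4\right \} \).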

First, we show that the set \(S\) cannot be empty, by contradiction. Assume \(S\) is empty. Hence every element of the vector \(\mathbf {v}\) is \(1\). Let us pick one of these elements \(v_{i}=1\) such that \(i\) corresponds to a row of the matrix \(B\) which sums to less than one\footnote {We know such a row exists in $B$ by the following argument: assume no row of $B$ sums to less than $1$. Then every row of $B$ sums to exactly $1$, which means the states $1,\dots ,m-1$ form a closed subset of the chain. But the original chain is irreducible, so it contains no proper closed subset. Therefore at least one row of $B$ sums to less than $1$. (Matrix $B$ is substochastic: at least one of its rows sums to less than $1$.)}. Then we write

\[ 1={\displaystyle \sum \limits _{j=1}^{m-1}} p_{ij}v_{j}={\displaystyle \sum \limits _{j=1}^{m-1}} p_{ij}\left ( 1\right ) ={\displaystyle \sum \limits _{j=1}^{m-1}} p_{ij}\]

But this row sums to less than one, so the RHS above is less than \(1\). This is a contradiction, hence the set \(S\) cannot be empty.

Now that we have shown the set \(S\) is not empty, we can write (2) as a sum over the sets \(J\) and \(S\) of indices. (We know the set \(J\) is not empty by definition: since the vector \(\mathbf {v}\) is normalized, it has at least one component equal to \(1\).) Hence (2) becomes

\begin{equation} \fbox {$v_{i}={\displaystyle \sum \limits _{j\in J}} p_{ij}v_{j}+{\displaystyle \sum \limits _{j\in S}} p_{ij}v_{j}$} \tag {3}\end{equation}

Let us again pick one of the components \(v_{i}\) with value \(1\) (we know there is at least one), and examine whether this equality can hold for that row \(i\). So (3) becomes

\[ 1={\displaystyle \sum \limits _{j\in J}} p_{ij}v_{j}+{\displaystyle \sum \limits _{j\in S}} p_{ij}v_{j}\]

But all the \(v_{j}\) with \(j\) in the set \(J\) have value \(1\), so the above simplifies to

\begin{equation} 1={\displaystyle \sum \limits _{j\in J}} p_{ij}+\overset {Y}{\overbrace {{\displaystyle \sum \limits _{j\in S}} p_{ij}v_{j}}} \tag {4}\end{equation}

Now each \(v_{j}\) in the term labeled \(Y\) above is less than \(1\) (since \(j\in S\)), so \({\displaystyle \sum \limits _{j\in S}} p_{ij}v_{j}<{\displaystyle \sum \limits _{j\in S}} p_{ij}\) whenever some \(p_{ij}\) with \(j\in S\) is non-zero. Since \({\displaystyle \sum \limits _{j\in J\cup S}} p_{ij}\leq 1\) (it is at most the full row sum of row \(i\)), the sum in (4) could then never add up to \(1\). So for (4) to be satisfied, we must have \(p_{ij}=0\) for every \(j\in S\). Hence the term labeled \(Y\) is zero.

What this means is that if \(v_{i}=1\), then the \(i^{th}\) row of the matrix \(B\) must have zero entries in every column whose index belongs to the set \(S\). (The set \(J\) need not be ’contiguous’; its indices can occur in any pattern.) Therefore \(p_{js}=0\) whenever \(j\) is a state whose number is an index in the set \(J\) and \(s\) is a state whose number is an index in the set \(S\).

What this means is that it is not possible to reach the states corresponding to indices in the set \(S\) from the states corresponding to indices in the set \(J\).

Hence, once the chain is in a state in the set \(J\) it is not possible to leave this set. Indeed, since (4) now reads \(1={\displaystyle \sum \limits _{j\in J}} p_{ij}\), the entire row of transition probabilities out of such a state is concentrated on \(J\); in particular the chain cannot jump from \(J\) to states \(0\) or \(m\) either.

But this is the same as saying the chain contains a closed subset of states, i.e. the chain is reducible. However, this is not possible: the matrix \(B\) is taken from a chain which is irreducible, i.e. one that contains no proper closed subsets, yet we found such a subset.

Therefore, we conclude that the assumption which led to this is invalid. There exists no vector \(\mathbf {v}\neq \mathbf {0}\) such that \(\left ( I-B\right ) \mathbf {v}=\mathbf {0}\). Hence \(\left ( I-B\right ) \) has an inverse. QED
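As an informal numerical illustration of this conclusion (a sketch only, using an arbitrary randomly generated chain; it demonstrates the result but does not prove it), one can build an irreducible chain in Python, extract \(B\) as in Part (A), and confirm that some row of \(B\) sums to less than \(1\) and that \(I-B\) is non-singular:

\begin{verbatim}
import numpy as np

# Random irreducible chain on states 0..m: all entries positive,
# rows normalized, so the chain has no proper closed subset.
rng = np.random.default_rng(0)
m = 6
P = rng.random((m + 1, m + 1))
P /= P.sum(axis=1, keepdims=True)

B = P[1:m, 1:m]  # interior submatrix, as in Part (A)

# At least one row of B sums to less than 1 (footnote argument).
print((B.sum(axis=1) < 1).any())               # True

# I - B is non-singular; equivalently the spectral radius of B is < 1.
print(np.linalg.det(np.eye(m - 1) - B) != 0)   # True
print(max(abs(np.linalg.eigvals(B))) < 1)      # True
\end{verbatim}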
