4.10.2 Problem 6.3

Part(A)

Let \(I_{n}\) be an indicator variable, for the chain started in state \(i\), defined as

\[ I_{n}=\left \{ \begin {array} [c]{ccc}1 & & \text {when }X_{n}=j\\ 0 & & \text {otherwise}\end {array} \right . \]

Hence

\[ E\left ( I_{n}\right ) =P\left ( X_{n}=j|X_{0}=i\right ) \]

Now we see that

\[ E\left ( V_{ij}\right ) =E\left ( {\displaystyle \sum \limits _{n=0}^{\infty }} I_{n}\right ) ={\displaystyle \sum \limits _{n=0}^{\infty }} E\left ( I_{n}\right ) ={\displaystyle \sum \limits _{n=0}^{\infty }} P\left ( X_{n}=j|X_{0}=i\right ) \]

Now, let \(b_{ij}\) be the \(\left ( i,j\right ) \) entry of the matrix \(B\), where \(b_{ij}=E\left ( V_{ij}\right ) \); then the above can be written as

\begin{equation} b_{ij}=P\left ( X_{0}=j|X_{0}=i\right ) \ +\ P\left ( X_{1}=j|X_{0}=i\right ) \ +\ P\left ( X_{2}=j|X_{0}=i\right ) \ +\ P\left ( X_{3}=j|X_{0}=i\right ) \ +\cdots \tag {1}\end{equation}

Which is the same as writing

\[ b_{ij}=P_{ij}^{\left ( 0\right ) }+P_{ij}^{\left ( 1\right ) }+P_{ij}^{\left ( 2\right ) }+P_{ij}^{\left ( 3\right ) }+\cdots \]

When \(i=j\), \(P_{ij}^{\left ( 0\right ) }=1\), otherwise it is \(0\); that is, \(P_{ij}^{\left ( 0\right ) }=\delta _{ij}\). Hence

\begin{equation} b_{ij}=\delta _{ij}+P_{ij}^{\left ( 1\right ) }+P_{ij}^{\left ( 2\right ) }+P_{ij}^{\left ( 3\right ) }+P_{ij}^{\left ( 4\right ) }+\cdots \tag {2}\end{equation}

Let the set of transient states be \(T\). Since \(i\) and \(j\) are transient, any path from \(i\) to \(j\) can pass only through transient states (once the chain enters a recurrent class it can never reach \(j\) again), so the intermediate sums below run over \(k\in T\). Using the Chapman-Kolmogorov equations, the above can be written as

\begin{equation} b_{ij}=\delta _{ij}+P_{ij}^{\left ( 1\right ) }+\overset {P_{ij}^{\left ( 2\right ) }}{\overbrace {{\displaystyle \sum \limits _{k\in T}} P_{ik}^{\left ( 1\right ) }P_{kj}^{\left ( 1\right ) }}}+\overset {P_{ij}^{\left ( 3\right ) }}{\overbrace {{\displaystyle \sum \limits _{k\in T}} P_{ik}^{\left ( 2\right ) }P_{kj}^{\left ( 1\right ) }}}+\overset {P_{ij}^{\left ( 4\right ) }}{\overbrace {{\displaystyle \sum \limits _{k\in T}} P_{ik}^{\left ( 3\right ) }P_{kj}^{\left ( 1\right ) }}}+\cdots \tag {3}\end{equation}

But \({\displaystyle \sum \limits _{k\in T}} P_{ik}^{\left ( 1\right ) }P_{kj}^{\left ( 1\right ) }\) is the \(i^{th}\) row of the \(Q\) matrix multiplied by the \(j^{th}\) column of the \(Q\) matrix, which is the \(\left ( i,j\right ) \) entry of the matrix \(Q^{2}\), and \({\displaystyle \sum \limits _{k\in T}} P_{ik}^{\left ( 2\right ) }P_{kj}^{\left ( 1\right ) }\) is the \(i^{th}\) row of the \(Q^{2}\) matrix we just obtained multiplied by the \(j^{th}\) column of the \(Q\) matrix, which is the \(\left ( i,j\right ) \) entry of the matrix \(Q^{3}\). Continuing this way, \(P_{ij}^{\left ( 4\right ) }\) is the \(\left ( i,j\right ) \) entry of the matrix \(Q^{4}\), and so on.

Hence we see that \(b_{ij}\) is the \(\left ( i,j\right ) \) entry of the matrix \(I+Q+Q^{2}+Q^{3}+\cdots \). QED.

Part(B) From part(A), we obtained that \(E\left ( V_{ij}\right ) \) is the \(\left ( i,j\right ) \) entry in the matrix resulting from the sum \(I+Q+Q^{2}+Q^{3}+\cdots \). Since \(Q\) contains only the transition probabilities among the transient states, its powers \(Q^{n}\) go to zero as \(n\) gets large, so this is a convergent (geometric) sum and \(I+Q+Q^{2}+Q^{3}+\cdots =\left ( I-Q\right ) ^{-1}\). Therefore

\(E\left ( V_{ij}\right ) \) is the \(\left ( i,j\right ) \) entry in the matrix \(\left ( I-Q\right ) ^{-1}.\)
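As a numerical sanity check, the identity can be tested on a small example. The gambler's-ruin chain below is a hypothetical illustration (my own example, not part of the problem): states \(\{0,1,2,3\}\), absorbing at \(0\) and \(3\), with \(p=1/2\), so the transient set is \(T=\{1,2\}\).

```python
# Hypothetical example: gambler's-ruin chain on {0,1,2,3} with p = 1/2.
# States 1 and 2 are transient; Q is P restricted to those states.

def matmul(A, B):
    """Product of two square matrices stored as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

Q = [[0.0, 0.5],
     [0.5, 0.0]]

# Partial sum S = I + Q + Q^2 + ... + Q^50 (Q^n -> 0, so this converges)
S = [[1.0, 0.0], [0.0, 1.0]]
Qn = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(50):
    Qn = matmul(Qn, Q)
    S = [[S[i][j] + Qn[i][j] for j in range(2)] for i in range(2)]

# Direct 2x2 inverse of (I - Q) = [[1, -0.5], [-0.5, 1]]
det = 1.0 * 1.0 - (-0.5) * (-0.5)
inv = [[1.0 / det, 0.5 / det],
       [0.5 / det, 1.0 / det]]

print(S)    # ~[[1.3333, 0.6667], [0.6667, 1.3333]]
print(inv)  # same values: the series sum equals (I - Q)^{-1}
```

The partial sums of \(I+Q+Q^{2}+\cdots \) agree with \(\left ( I-Q\right ) ^{-1}\) to machine precision; e.g. starting from state \(1\), the expected number of visits to state \(1\) is \(4/3\).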

Problem 6.5

Part(A) I solve this part in two ways: first by conditioning on the next state, as required, and then by the counting method explained in the lecture.

First, by conditioning on the next state. Let \(I\) be the set of all states. Then

\begin{equation} E\left ( T_{ij}\right ) ={\displaystyle \sum \limits _{k\in I}} E\left ( T_{ij}|X_{1}=k,X_{0}=i\right ) P\left ( X_{1}=k|X_{0}=i\right ) \tag {1}\end{equation}

But by the Markov property, the future of the chain depends only on its current state, hence \(E\left ( T_{ij}|X_{1}=k,X_{0}=i\right ) =E\left ( T_{ij}|X_{1}=k\right ) \), and since \(P\left ( X_{1}=k|X_{0}=i\right ) =P_{ik}\), (1) can be written as

\begin{equation} E\left ( T_{ij}\right ) ={\displaystyle \sum \limits _{k\in I}} E\left ( T_{ij}|X_{1}=k\right ) P_{ik} \tag {2}\end{equation}

Now, when \(X_{1}=j\), the chain is already in state \(j\) after one step, so \(E\left ( T_{ij}|X_{1}=j\right ) =1\). When \(X_{1}=k\) with \(k\neq j\), one step has already been taken and the chain starts afresh from \(k\), so \(E\left ( T_{ij}|X_{1}=k\right ) =1+E\left ( T_{kj}\right ) \). Therefore (2) becomes

\[ E\left ( T_{ij}\right ) =P_{ij}+{\displaystyle \sum \limits _{\substack {k\in I\\k\neq j}}} P_{ik}\left ( 1+E\left ( T_{kj}\right ) \right ) \]

But \(P_{ij}+{\displaystyle \sum \limits _{\substack {k\in I\\k\neq j}}} P_{ik}=1\), hence the above simplifies to

\[ E\left ( T_{ij}\right ) =1+{\displaystyle \sum \limits _{\substack {k\in I\\k\neq j}}} E\left ( T_{kj}\right ) P_{ik}\]

Using the notation shown in the problem, the above becomes

\[ m_{ij}=1+{\displaystyle \sum \limits _{\substack {k\in I\\k\neq j}}} m_{kj}P_{ik}\]

QED.
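As a quick check of this recurrence, on a toy two-state chain of my own (not from the problem) where the mean hitting times have known closed forms, one can solve \(m_{ij}=1+\sum _{k\neq j}P_{ik}m_{kj}\) by fixed-point iteration:

```python
# Toy two-state chain (my example): P = [[1-a, a], [b, 1-b]].
# Known closed forms: m_01 = 1/a, m_10 = 1/b, m_00 = 1 + a/b.
a, b = 0.3, 0.2
P = [[1 - a, a], [b, 1 - b]]

# Iterate m_ij <- 1 + sum_{k != j} P_ik * m_kj; the map is a
# contraction (the matrix with column j removed is substochastic).
m = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(2000):
    m = [[1 + sum(P[i][k] * m[k][j] for k in range(2) if k != j)
          for j in range(2)] for i in range(2)]

print(m[0][1], 1 / a)      # both ~3.3333
print(m[0][0], 1 + a / b)  # both ~2.5
```

Note that the same update applied to the diagonal entry gives the mean return time \(m_{jj}=1+\sum _{k\neq j}P_{jk}m_{kj}\), which is used again in part (C).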

Now solve part (A) using a counting argument. Imagine running \(N\) independent copies of the chain, all started in state \(i\). About \(NP_{ij}\) of them move directly to state \(j\), taking \(1\) step, and about \(NP_{ik}\) of them move first to some state \(k\neq j\), taking on average \(m_{kj}+1\) steps in total.

Then we write (letting \(E\left ( T_{ij}\right ) =m_{ij}\) and \(E\left ( T_{kj}\right ) =m_{kj}\))

\begin{align*} m_{ij} & =\frac {NP_{ij}\left ( 1\right ) +{\displaystyle \sum \limits _{k\neq j}} NP_{ik}\left ( m_{kj}+1\right ) }{N}\\ & =P_{ij}+{\displaystyle \sum \limits _{k\neq j}} P_{ik}\left ( m_{kj}+1\right ) \\ & =P_{ij}+{\displaystyle \sum \limits _{k\neq j}} P_{ik}m_{kj}+{\displaystyle \sum \limits _{k\neq j}} P_{ik}\end{align*}

But \(P_{ij}+{\displaystyle \sum \limits _{k\neq j}} P_{ik}=1\) hence the above becomes

\begin{equation} m_{ij}=1+{\displaystyle \sum \limits _{k\neq j}} P_{ik}m_{kj} \tag {1}\end{equation}

Part(B) We start from the result of part (A) which is

\[ m_{ij}=1+{\displaystyle \sum \limits _{k\neq j}} P_{ik}m_{kj}\]

Multiply both sides by \(w_{i}\) and obtain

\[ w_{i}\ m_{ij}=w_{i}+w_{i}{\displaystyle \sum \limits _{k\neq j}} P_{ik}m_{kj}\]

Sum over all possible states \(i\) and obtain

\begin{equation}{\displaystyle \sum \limits _{i=1}^{r}} w_{i}\ m_{ij}={\displaystyle \sum \limits _{i=1}^{r}} w_{i}+{\displaystyle \sum \limits _{i=1}^{r}} \left ( w_{i}{\displaystyle \sum \limits _{k\neq j}} P_{ik}m_{kj}\right ) \tag {2}\end{equation}

But \({\displaystyle \sum \limits _{i=1}^{r}} w_{i}=1\) and \({\displaystyle \sum \limits _{i=1}^{r}} \left ( w_{i}{\displaystyle \sum \limits _{k\neq j}} P_{ik}m_{kj}\right ) ={\displaystyle \sum \limits _{i=1}^{r}} \left ( {\displaystyle \sum \limits _{k\neq j}} w_{i}P_{ik}m_{kj}\right ) ={\displaystyle \sum \limits _{k\neq j}} m_{kj}\left ( {\displaystyle \sum \limits _{i=1}^{r}} w_{i}P_{ik}\right ) \), hence (2) becomes

\begin{equation}{\displaystyle \sum \limits _{i=1}^{r}} w_{i}\ m_{ij}=1+{\displaystyle \sum \limits _{k\neq j}} m_{kj}\left ( {\displaystyle \sum \limits _{i=1}^{r}} w_{i}P_{ik}\right ) \tag {3}\end{equation}

Now, since \(w=\left \{ w_{1},w_{2},\cdots ,w_{r}\right \} \) is the stationary state vector, then it satisfies the following relation

\[ w=wP \]

Where \(P\) is the one step probability transition matrix. Written out componentwise, this says

\begin{equation} w_{k}={\displaystyle \sum \limits _{i=1}^{r}} w_{i}P_{ik} \tag {4}\end{equation}

Where \(k\) is any state. Substituting (4) into the RHS of (3), we can rewrite (3) as

\begin{align}{\displaystyle \sum \limits _{i=1}^{r}} w_{i}\ m_{ij} & =1+{\displaystyle \sum \limits _{\substack {k=1\\k\neq j}}^{r}} m_{kj}w_{k}\nonumber \\ \overset {A}{\overbrace {{\displaystyle \sum \limits _{i=1}^{r}} w_{i}\ m_{ij}}}-\overset {B}{\overbrace {{\displaystyle \sum \limits _{\substack {k=1\\k\neq j}}^{r}} m_{kj}w_{k}}} & =1 \tag {5}\end{align}

Now looking at the LHS, the first sum, labeled \(A\), runs over all states, while the second sum, labeled \(B\), runs over all states except \(j\). Hence when we subtract \(B\) from \(A\), only the term \(m_{jj}w_{j}\) survives, and (5) becomes

\[ m_{jj}w_{j}=1 \]

or

\[ \fbox {$m_{jj}=\frac {1}{w_{j}}$}\]

QED.
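This result is easy to confirm numerically. Below, on a hypothetical 3-state ergodic chain (my own example), the stationary vector \(w\) is found by iterating \(w\leftarrow wP\), and the mean hitting times by iterating the recurrence from part (A); the diagonal entries \(m_{jj}\) indeed come out equal to \(1/w_{j}\):

```python
# Hypothetical 3-state ergodic chain (not from the problem).
P = [[0.5, 0.25, 0.25],
     [0.2, 0.6,  0.2],
     [0.3, 0.3,  0.4]]
n = len(P)

# Stationary vector by power iteration on w = wP
w = [1.0 / n] * n
for _ in range(5000):
    w = [sum(w[i] * P[i][k] for i in range(n)) for k in range(n)]

# Mean hitting times from m_kj = 1 + sum_{l != j} P_kl * m_lj;
# applied to k = j, the same update makes m[j][j] the mean return time.
m = [[0.0] * n for _ in range(n)]
for _ in range(5000):
    m = [[1 + sum(P[k][l] * m[l][j] for l in range(n) if l != j)
          for j in range(n)] for k in range(n)]

for j in range(n):
    print(m[j][j], 1 / w[j])  # the two columns agree: m_jj = 1/w_j
```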

Part (C)

If we wait for the chain to reach its steady state (i.e. the chain's probability state vector no longer changes, so \(w=wP\)), and then observe the chain from that point on for a long period of time, say \(T\) steps, then the number of times the chain is in state \(j\) during this time is approximately \(w_{j}T\), since \(w_{j}\) is the probability of the chain being in state \(j\). So, to find the average number of time units (steps) it took the chain to go from state \(j\) back to state \(j\), we divide \(T\) by the number of times the chain was in state \(j\) during this time, which we just found to be \(w_{j}T\).

Hence

\[ m_{jj}=\frac {T}{w_{j}T}=\frac {1}{w_{j}}\]

Intuitively this makes sense: the smaller the probability that the chain is in state \(j\), the larger we expect the time between visits to state \(j\) to be, so the relation should be an inverse one, as was found. QED
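This frequency argument can also be acted out with a short simulation, again on a toy two-state chain of my own, where \(w_{0}=b/\left ( a+b\right ) =0.4\), so the predicted mean return time to state \(0\) is \(1/w_{0}=2.5\):

```python
import random

# Toy two-state chain (my example), a = 0.3, b = 0.2:
# P[i][0] is the probability of moving to state 0.
# Here w_0 = 0.2/(0.3 + 0.2) = 0.4, so m_00 should be 1/0.4 = 2.5.
random.seed(1)
P = [[0.7, 0.3],
     [0.2, 0.8]]

state, last_visit, returns = 0, 0, []
for t in range(1, 200_000):
    state = 0 if random.random() < P[state][0] else 1
    if state == 0:
        returns.append(t - last_visit)  # steps since last visit to 0
        last_visit = t

avg = sum(returns) / len(returns)
print(avg)  # ~2.5 = 1/w_0
```

The observed average return time matches \(1/w_{0}\) up to sampling noise, exactly as the \(T/\left ( w_{j}T\right ) \) argument predicts.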
