3X3 matrix, Markov chain: Calculate P^100 analytically...

ilovepanda

New member
Joined
Jan 31, 2016
Messages
3
Hello everyone, I'm having some trouble resolving this problem:

Consider the following 3x3 matrix P:
0.5 0.3 0.2
0.2 0.2 0.6
0.4 0.4 0.2

- Calculate P^100 analytically but quickly (show the steps - no Matlab)
- Assume this matrix corresponds to a (discrete time) markov chain where entry Pij shows the probability of going from state i to state j. What is the stationary distribution for this chain (if any)?
- How fast does the chain converge to the stationary distribution? (assume the initial distribution is (1,0,0) and calculate some difference metric between the initial distribution and the derived stationary distribution).

Any ideas?
 
Hello everyone, I'm having some trouble resolving this problem:

Consider the following 3x3 matrix P:
0.5 0.3 0.2
0.2 0.2 0.6
0.4 0.4 0.2

- Calculate P^100 analytically but quickly (show the steps - no Matlab)
- Assume this matrix corresponds to a (discrete time) markov chain where entry Pij shows the probability of going from state i to state j. What is the stationary distribution for this chain (if any)?
- How fast does the chain converge to the stationary distribution? (assume the initial distribution is (1,0,0) and calculate some difference metric between the initial distribution and the derived stationary distribution).

Any ideas?
What are your thoughts?

Please share your work with us ...even if you know it is wrong

If you are stuck at the beginning tell us and we'll start with the definitions.

You need to read the rules of this forum. Please read the post titled "Read before Posting" at the following URL:

http://www.freemathhelp.com/forum/th...Before-Posting
 
Subhotosh Khan, thanks for replying. I'm preparing an exam and I found this problem in the previous year's documents. I've been trying to solve it, thinking about it ever since. So firstly, let me tell you what I got so far.


- P100 = P4 x P32 x P64 =
0.376 0.301 0.32
0.376 0.301 0.32
0.376 0.301 0.32

- Def: "The stationary distribution of a markov chain with transition matrix P is some vector, x, such that xP=x." Then we multiply x = [x1 x2 x3] with the matrix P from the beginning, and get the value of the vector (the stationary distribution): x=(0.377, 0.301, 0.32).

- The last question is the one causing trouble.

Firstly, how do I calculate convergence speed from (1,0,0) to (0.377, 0.301, 0.32)?
Secondly, what are the difference metrics? I've only heard of "Total Variation Distance", but unfortunately, "I've only heard of it" is really all I know.

I would be tremendously grateful to any guidance/help you could provide me. Thanks!


 
Last edited:
Subhotosh Khan, thanks for replying. I'm preparing an exam and I found this problem in the previous year's documents. I've been trying to solve it, thinking about it ever since. So firstly, let me tell you what I got so far.

0.376 0.301 0.32
- P100 = P4 x P32 x P64 = 0.376 0.301 0.32
0.376 0.301 0.32

- Def: "The stationary distribution of a markov chain with transition matrix P is some vector, x, such that xP=x." Then we multiply x = [x1 x2 x3] with the matrix P from the beginning, and get the value of the vector (the stationary distribution): x=(0.377, 0.301, 0.32).

- The last question is the one causing trouble.

Firstly, how do I calculate convergence speed from (1,0,0) to (0.377, 0.301, 0.32)?
Secondly, what are the difference metrics? I've only heard of "Total Variation Distance", but unfortunately, "I've only heard of it" is really all I know.

I would be tremendously grateful to any guidance/help you could provide me. Thanks!


Have you learned about eigen values and eigen vectors yet?
 

In order to solve the problem in an efficient way, I think the use of eigenvalues and eigenvectors are essential. The concept and use of these quantities are in themselves a subject of study. I think
https://www.math.hmc.edu/math40-01/math40-lect14.pdf
makes for a good introduction to eigenvalue and eigenvectors.

I'll do a brief discussion and use 2X2's for examples to keep things simple: Let
A = \(\displaystyle \begin{pmatrix}
a_{11}& a_{12}\\ a_{21}& a_{22}\end{pmatrix}\)
The idea behind raising a matrix to a large power is that, for example, we can write
A2 = \(\displaystyle A \begin{pmatrix}
a_{11}& 0\\ a_{21}& 0\end{pmatrix}\, +\, A \begin{pmatrix}
0& a_{12}\\ 0& a_{22}\end{pmatrix}\) = \(\displaystyle \begin{pmatrix}
b_{11}& b_{12}\\ b_{21}& b_{22}\end{pmatrix}\)
A3 = \(\displaystyle A \begin{pmatrix}
b_{11}& 0\\ b_{21}& 0\end{pmatrix}\, +\, A \begin{pmatrix}
0& b_{12}\\ 0& b_{22}\end{pmatrix}\) = \(\displaystyle \begin{pmatrix}
c_{11}& c_{12}\\ c_{21}& c_{22}\end{pmatrix}\)
and so forth. Thus we only need to do matrix multiplication like
\(\displaystyle A \begin{pmatrix}
x_{11}& 0\\ x_{21}& 0\end{pmatrix}\)
or
\(\displaystyle A \begin{pmatrix}
0& b_{12}\\ 0& b_{22}\end{pmatrix}\)
which can both be reduced to matrix multiplication like
\(\displaystyle A \begin{pmatrix}
x_{11}\\ x_{21}\end{pmatrix}\)

We now come to the idea of eigenvalues and eigenvectors. Without going into details, for which see the above link to start with, for each mXm square matrix there are m eigenvalues \(\displaystyle \lambda_i,\, i=1,2,3,...,m\) and corresponding eigenvectors \(\displaystyle v_i,\, i=1,2,3,...,m\) such that
A vi = \(\displaystyle \lambda_i\) vi
If the eigenvalues are all distinct and non-zero, the eigenvectors span the space. That is for any vector [in our 2X2 example space]
x = \(\displaystyle \begin{pmatrix}x_{11}\\ x_{21}\end{pmatrix}\)
there exists coefficients a1 and a2 such that
x = a1 v1 + a2 v2.
Well now start multiplying by A
A x = A (a1 v1 + a2 v2) = a1 Av1 + a2 Av2 = a1 \(\displaystyle \lambda_1\)v1 + a2 \(\displaystyle \lambda_2\)v2
Multiply by A again
A2 x = a1 \(\displaystyle \lambda_1^2\)v1 + a2 \(\displaystyle \lambda_2^2\)v2
...
An x = a1 \(\displaystyle \lambda_1^n\)v1 + a2 \(\displaystyle \lambda_2^n\)v2


Lets assume distinct eigenvalues with \(\displaystyle |\lambda_1|\, \gt\, |\lambda_2|\), then
An x = \(\displaystyle \lambda_1^n\, \left[a_1 v_1 + a_2 \left(\frac{\lambda_2}{\lambda_1}\right)^n\, v_2\right]\)
What happens to that
\(\displaystyle \left(\frac{\lambda_2}{\lambda_1}\right)^n\)
as n becomes large? What then happens to An x? What then happens to An?

EDIT: Fix grouping, i.e. I made a dumb mistake and fixed it.
 
Last edited:
Need help

Subhotosh Khan, thanks for replying. I'm preparing an exam and I found this problem in the previous year's documents. I've been trying to solve it, thinking about it ever since. So firstly, let me tell you what I got so far.


- P100 = P4 x P32 x P64 =
0.376 0.301 0.32
0.376 0.301 0.32
0.376 0.301 0.32

- Def: "The stationary distribution of a markov chain with transition matrix P is some vector, x, such that xP=x." Then we multiply x = [x1 x2 x3] with the matrix P from the beginning, and get the value of the vector (the stationary distribution): x=(0.377, 0.301, 0.32).

- The last question is the one causing trouble.

Firstly, how do I calculate convergence speed from (1,0,0) to (0.377, 0.301, 0.32)?
Secondly, what are the difference metrics? I've only heard of "Total Variation Distance", but unfortunately, "I've only heard of it" is really all I know.

I would be tremendously grateful to any guidance/help you could provide me. Thanks!



I have been given the same question, If u solved it completely, I will be thankful if you share the entire procedure.
 
@ilovepanda, have you done with answer, If so, kindly share, i am also need of answer for the same question
 
@ilovepanda, have you done with answer, If so, kindly share, i am also need of answer for the same question
Please reply with your thoughts and efforts so far, based on what you've gone in class and what you've read in previous helps and hints within this thread. Thank you! ;)
 
Please reply with your thoughts and efforts so far, based on what you've gone in class and what you've read in previous helps and hints within this thread. Thank you! ;)
I have attached the answers of 1st two parts, am also confused for the third part,

Do i have to multiply the initial distribution with transition matrix and the result will be multiplied again with transition matrix, once we get the stationary distribution vector, we say that it is converging.
(1 0 0)(.5 .5 .4) the result will be multiplied again with P, until we get the stationary distribution vector, If we get
.3 .2 .4 after 3 multiplications, then how we will define convergence
.2 .6 .2
 

Attachments

  • Stationay distribution.jpg
    Stationay distribution.jpg
    12.2 KB · Views: 12
  • P100-1.jpg
    P100-1.jpg
    25.9 KB · Views: 13
Top