Predicting a linear congruential generator (modular arithmetic)

Samkli

New member
Joined
Jan 8, 2015
Messages
4
A while ago I learnt that most computers uses a linear congruential generator to generate pseudorandom numbers. A linear congruential generator looks like this:

X(n+1) (a * X(n) + b) mod m
where a, b and m are constants and X(0) is the initial seed that starts the generator.


I also learnt that if you know the first few generated pseudorandom numbers it is possible to predict the next number without knowing any of the constants and without knowing X(0). I wanted to see how that was possible and found a proof of it online, but since I don't know very much about modular arithmetic I didn't quite understand some parts of the proof, and would like some help understanding it.

in the proof they started with the linear congruential generator. Then they defined a new sequence of numbers as:

Y(n) = X(n+1) - X(n)

then they say that the reason for this is because if X(n+1) (a * X(n) + b) mod m is true, then Y(n+1) (a * Y(n)) mod m must be true as well. My question is, how do I know this? Or what do I have to know about modular arithmetic in order to prove that equation?
 
A while ago I learnt that most computers uses a linear congruential generator to generate pseudorandom numbers. A linear congruential generator looks like this:

X(n+1) (a * X(n) + b) mod m
where a, b and m are constants and X(0) is the initial seed that starts the generator.


I also learnt that if you know the first few generated pseudorandom numbers it is possible to predict the next number without knowing any of the constants and without knowing X(0). I wanted to see how that was possible and found a proof of it online, but since I don't know very much about modular arithmetic I didn't quite understand some parts of the proof, and would like some help understanding it.

in the proof they started with the linear congruential generator. Then they defined a new sequence of numbers as:

Y(n) = X(n+1) - X(n)

then they say that the reason for this is because if X(n+1) (a * X(n) + b) mod m is true, then Y(n+1) (a * Y(n)) mod m must be true as well. My question is, how do I know this? Or what do I have to know about modular arithmetic in order to prove that equation?
To answer your last question first, two numbers, x, y, are equal, "modulo m", if they have the same remainder when divided by m. That is equivalent to saying that x- y has remainder 0 when divided by m: if x, divided by m, has quotient p with remainder r, then x= mp+ r, and if y, divided by m has quotient q with remainder r, then y= mq+ r so that x- y= (mp+ r)- (mq+ r)= mp- mq= m(p- q).

If X(n+1)= aX(n)+ b (mod m) and X(n)= aX(n-1)+ b (mod m), then Y(n)= X(n+1)- X(n)= (aX(n)+ b)- (aX(n- 1)- b)= a(X(n+1)- X(n))= aY(n) (mod m)

(Note that all of that is just basic arithmetic with the (mod m) added on the end.)
 
A while ago I learnt that most computers uses a linear congruential generator to generate pseudorandom numbers. A linear congruential generator looks like this:

X(n+1) (a * X(n) + b) mod m
where a, b and m are constants and X(0) is the initial seed that starts the generator.


I also learnt that if you know the first few generated pseudorandom numbers it is possible to predict the next number without knowing any of the constants and without knowing X(0). I wanted to see how that was possible and found a proof of it online, but since I don't know very much about modular arithmetic I didn't quite understand some parts of the proof, and would like some help understanding it.

in the proof they started with the linear congruential generator. Then they defined a new sequence of numbers as:

Y(n) = X(n+1) - X(n)

then they say that the reason for this is because if X(n+1) (a * X(n) + b) mod m is true, then Y(n+1) (a * Y(n)) mod m must be true as well. My question is, how do I know this? Or what do I have to know about modular arithmetic in order to prove that equation?

The proof uses
(a + b) mod(m) = a mod(m) + b mod(m)
and the fact that a-c = a+[(-1)*c] = a + b where b=-c. To show this in an example [NOT as a proof], we first note that that
a = c mod(m)
means there is some integer n so that
a = n m + c
Than is, if you divide a by m you will have a remainder of c. Note that if a = n m + c, a = (n-j) m + (c + j m) so you can always add (or subtract) multiples of m to c for other numbers of congruency. Generally at the end of the problem we reduce c so that it is non-negative and less than m. However, that isn't necessary unless directed by the set of rules you are working under (an instructors direction for example).

Now suppose m is 20, a = 53, and b is 214. Then a + b = 267 and
a = 13 mod(20)
b = 14 mod(20)
a+b = 7 mod(20)
13 + 14 = 27 = 7 mod(20)
 
Thanks, I understand it now. Was planning to ask about another part of the proof that I didn't quite understand either, but I figured it out myself now.
 
Ran into another thing I don't understand now. If we use the constants that Ishuda used above in his example (a = 53, b = 214, m = 20) we get the following generator:

X(n+1) (53 * X(n) + 214) mod 20

If we use the initial seed 0 our first four generated pseudorandom numbers would be:

X(1) = 14
X(2) = 16
X(3) = 2
X(4) = 0

And if we let Y(n) = X(n+1) - X(n) we would get that:

Y(1) = 2
Y(2) = -14
Y(3) = -2

Since we know that Y(n+1) (a * Y(n)) mod m, we also know that Y(n) * Y(n+2) Y(n+1)^2 mod m and that m divides Y(n) * Y(n+2)) - Y(n+1)^2. If we didn't already know m I would now guess that m = Y(n) * Y(n+2)) - Y(n+1)^2 and call the guess m(0). I would then guess that a = Y(n+1)/Y(n) mod m(0) and call the guess a(0). In this case I would get that:

m(0) = Y(1) * Y(3) - Y(2)^2 = 2 * -2 - (-14)^2 = -200
a(0) = -14/2 mod -200

Since this guess clearly is wrong I wonder, how do I update my guess m(0)? I heard from someone that if the greatest common divisor of Y(2) and m(0) is greater than 1, then I should update my guess of m to m(0) / (m(0), Y). But how does that help me to get any closer to guessing the correct value of m?
 
Last edited:
Ran into another thing I don't understand now. ...
I don't know the answer to your question but I would play around with it a different way: First read = as congruent mod(m). We know, from X(1) that
b = 14
and m is greater than 14 (or 14 would have been reduced by m). So, since X(n+1) = a X(n) + b, we have a X(n) = X(n+1) - b or
14 a = 2
16 a = -14
2 a = -14
0 = 0
14 a = 2
...
By that same reasoning of above, we have m greater than 16.

Playing with the above we have
14 a = 7 * (-14) = -98 = 2
or, using that last 'equality' we have m is a divisor of 100. Thus m = 20, 25, or 50. And continue from there.

In reality it wouldn't be that easy but then I haven't really looked into methods to obtain the a, b, and m for LCGs.
 
Will try to type out what the proof I found says in more detail in case that will make it easier for someone to help me understand how to update my guess of m.

If we guess that m(0) = Y(1) * Y(3) - Y(2)^2 and that a(0) = Y(n+1)/Y(n) mod m(0) we can now take a look at the following generator:
___......... _____........................ ___.............................................. ___
Y(n) a(0)Y(n-1) mod m(0) where Y(1) = Y(1) .................. (what does Y(n) even mean in this case?)
...___................................................................................._ ..........................................................___
If Y(n) = Y(n) for every n we probably have the right generator. Otherwise since m|m(0) we know that a(0)Y(n) aY(n) mod m,
..............................._____..................................................................______
and therefor also that Y(n+1) = Y(n+1) mod m which means that m divides Y(n+1) - Y(n+1). Therefor we can update our guess
....................................................................._____
of m to the greatest common divisor of m(0) and Y(n+1) - Y(n+1). Then we repeat the process until we find the correct value of m.


Further down it says that when calculating a(0) with the euclidean algorithm we sometimes get a Y(1) and Y(2) such that Y(1)/Y(2) mod m(0) is undefined. This means that our guess m(0) of m is too big. If (Y(1), Y(2)) = 1 it's because (Y(2), m(0)) > 1 and we should update our guess of m(0) to m(0) / (m(0), Y). Sometimes we have to do this multiple times, but eventually we will find the biggest factor of m(0) where Y(1)/Y(2) exists. However I still don't quite understand how I should update my guess of m(0) in the example I posted above.


(I hope my attempt at adding the line above certain variables didn't screw up anything)
 
The line over the Y [which can be read as Y bar] just means an estimate of the given Y. We can get around that by using a small y to indicate the estimate.

So pick an estimate of a and m [= a(0) and m(0)] and set y(1) = Y(1). Now generate a set of estimates of Y [a set of y's] by using
y(n+1) = a(0) * y(n) mod(m(0))
If y(n) = Y(n) for all n [if our estimates are actually the given values] then the estimates a(0) and m(0) are pretty darn good and we can stop.

I'll look at the rest of it later or maybe someone else will comment.
 
Samkli
First
Y(n) = X(n) - X(n-1) = a X(n-1) + b - (a X(n-2) + b + m c(n-2)) = a (X(n-1) - X(n-2)) - m c(n-2) = a Y(n-1) - m c(n-2)
where c(n-2) is unknown and just the multiple of n to make things work out. That is
Y(n) = a Y(n-1) mod(m)
Now consider
p(n-1) = Y(n-1) * Y(n+1) - Y2(n) = Y(n-1) [ a Y(n) + m d(n)] - [a Y(n-1) + m d(n-1)]2 = m [ Y(n-1) (d(n) - d(n-1) + m d(n-1)]
where the d's play the same role for the Y's that the c's did for the X's. Thus we have p(n) is divisible by m as is -p(n). So we can do a trial run by considering m(0) = min{|p(n}|. For a value of a we note that we also have
Y(n) = (a + j m) Y(n-1) mod(m)
so we can consider a is (enough) relatively prime to m and the c's that a = Y(n)/Y(n-1) for some n plus the first multiple of m to make the number positive. For, the following example, this also says something about the original generating function m = 20 and a = 53. Since any a = 53 mod(20) would have worked, a should really be less than m and the original a of the generating function would be 13.

Take the example:
X = {14, 16, 2, 0, 14, ...}
Y = {2, -14, -2, 14, 2, ...}
p = {-200, -200, -200, ...}
So m(o) = 200 and a(0) = -7 mod(200) = 193. Let y(n) be the estimate for Y(n) with y(1) = Y(1), then
y = {2, 186, 98, 114, 2, ...}
and
Y-y = {0, 200, 100, 100, 0, ...}
It can be show that Y(n) - y(n) is divisible by m, so, as an update, pick the smallest Y(n)-y(n) [or, as the write up has it, the largest divisor of 100 and 200].
So m(1) = 100 and a=-7 mod(100)= 93:
y = {2, 86, 98, 14, 2, 86, ...}
Y - y = {0, 100, 100, 0, 0, 100, ...}
But 100 is zero mod(100), so looks like we found our a and m. Hah! If we try m=50, a=43; m=25, a=18; m=20 and a=13 we will find the same thing, Y-y = 0 mod(m). This may or may not be a result of the randomly picked a and m. I'm not sure how it would work with an actual a and m from an LCG.
 
Top