Question about Statistic Anomaly

Rickdan

New member
Joined
May 2, 2016
Messages
2
Hi!

I know that you advise to speak a good english, but thats not my mother tongue, so I hope you apologize any mistakes from me.

I was trying to see check how random Excel is generating numbers, because I read that Excel is not a real random generator. Very recently someone showed me a paper speaking about Excel & randomness, but I still want to test it. So, my problem here is not really to discover about Excel itself but more about the maths and how can I find it by myself.

So, to test the randomness of Excel, I generated lots of numbers so I can check if there is any "anomaly" from what should be expected by chance. Here is what I did:


- I have lots of cells, near 695505, each one generating a number from 1 to 4.
- I counted how many numbers 1 were generated, and numbers 2, 3 and 4. For example, from the near 700000 random numbers, 25,07% are numbers 1 , 24,94% are the number 2 , 24,98% are number 3 and 25,01% are the number 4 . I also calculated the SD (standard deviation): 0,047 and then how many standard deviations each result was from the mean (the mean is always exactly 25%): for number 1, -1,47 SDs away from the mean; 1,26 for number 2; 0,44 for number 3 and -0,23 for number 4.


What should I do next, and how? How can I read and interpret the data so I can check if there is any anomaly? I mean, we should expect by chance that all the numbers were 25% (exactly25,0000...) from the total. Should I calculate the magnitude size effect? An how would I do that in this case?


Thanks for any insights about this
 
Hi!

I know that you advise to speak a good english, but thats not my mother tongue, so I hope you apologize any mistakes from me.

I was trying to see check how random Excel is generating numbers, because I read that Excel is not a real random generator. Very recently someone showed me a paper speaking about Excel & randomness, but I still want to test it. So, my problem here is not really to discover about Excel itself but more about the maths and how can I find it by myself.

So, to test the randomness of Excel, I generated lots of numbers so I can check if there is any "anomaly" from what should be expected by chance. Here is what I did:


- I have lots of cells, near 695505, each one generating a number from 1 to 4.
- I counted how many numbers 1 were generated, and numbers 2, 3 and 4. For example, from the near 700000 random numbers, 25,07% are numbers 1 , 24,94% are the number 2 , 24,98% are number 3 and 25,01% are the number 4 . I also calculated the SD (standard deviation): 0,047 and then how many standard deviations each result was from the mean (the mean is always exactly 25%): for number 1, -1,47 SDs away from the mean; 1,26 for number 2; 0,44 for number 3 and -0,23 for number 4.


What should I do next, and how? How can I read and interpret the data so I can check if there is any anomaly? I mean, we should expect by chance that all the numbers were 25% (exactly25,0000...) from the total. Should I calculate the magnitude size effect? An how would I do that in this case?


Thanks for any insights about this

Sorry about delayed approval ...
 
Well, the main factor that determines why numbers generated by Excel aren't distributed perfectly uniformly is due to the nature of randomness. Say you flip a million coins. You'd expect to see 500,000 heads and 500,000 tails. In reality, one or the other might be ahead by a small margin. The same is true for random numbers. If you roll a 4-sided dice a million times, you'll see "hot streaks" where a specific number shows up many times in a row, skewing the results off of the expected distribution.

As for the specifics of Excel's random numbers, per this page from Microsoft: Prior to 1993, the first random number generated by Excel was always fixed, and it was 0.711327. Subsequent random numbers were then generated by this formula:

Random number = (9821 * [Previous Random Number] + 0.211327) % 1, where % denotes the modulus operator, or taking the remainder after division. In this case, it takes only the part of the number after the decimal point.

Versions of Excel made after 1993 still use the same formula, but the first random number is now "determined from the system clock."
 
Subhotosh Khan, dont worry about the dealy. I am thankful that you approved my topic :).

ksdhart2: Thanks for your answer, very useful for learning about excel randomness, but I would like to know the maths behind my question. In fact, my real intereste now is not about Excel (is it random? is it not? and so on) but about my specific question: how can I look to the data and say "There is an anomaly [or not] because of this or that."?

Thanks all of you once again :)
 
Subhotosh Khan, dont worry about the dealy. I am thankful that you approved my topic :).

ksdhart2: Thanks for your answer, very useful for learning about excel randomness, but I would like to know the maths behind my question. In fact, my real intereste now is not about Excel (is it random? is it not? and so on) but about my specific question: how can I look to the data and say "There is an anomaly [or not] because of this or that."?

Thanks all of you once again :)
That will require rigorous courses in applied statistics - may be lasting several semesters. As a matter of fact - answering that question is the main job of statisticians.
 
Top