Correlation among independently uniformly distributed variables

Kurama · Mar 10, 2022

Suppose each student takes a math test and a science test, each scored from 0 to 100. If the two scores are said to be independently uniformly distributed random variables on [0, 100], does the standard correlation formula, Cov(X,Y)/(SD(X)*SD(Y)), still apply? Or is the correlation necessarily 0 because they are said to be "independently uniformly distributed"?

Alternatively, how would this apply to the subset of students whose combined score is greater than 100?

BigBeachBanana · Mar 10, 2022

Start with the covariance formula:

Cov(X,Y)=E(XY)-E(X)E(Y)

But due to independence, we know that

E(XY)=E(X)E(Y)

Therefore,

Cov(X,Y)=E(XY)-E(X)E(Y)=E(X)E(Y)-E(X)E(Y)=0

It follows that the correlation

Corr(X,Y)=Cov(X,Y)/(SD(X)SD(Y))=0/(SD(X)SD(Y))=0

It should make intuitive sense: if two random variables are independent, then they are uncorrelated, i.e. their correlation is 0. So the standard formula still applies; it simply evaluates to 0.
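As a quick sanity check of the derivation above, here is a minimal simulation sketch. The sample size, the seed, and the helper name `pearson` are my own choices for illustration, not part of the original question. With independent uniform draws, the sample correlation should come out close to 0, though not exactly 0 because of sampling noise.

```python
import math
import random


def pearson(xs, ys):
    """Sample version of Cov(X,Y) / (SD(X) * SD(Y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)   # arbitrary fixed seed so the run is repeatable
n_students = 100_000     # hypothetical sample size

# Two independent draws per student, each uniform on [0, 100].
math_scores = [rng.uniform(0, 100) for _ in range(n_students)]
science_scores = [rng.uniform(0, 100) for _ in range(n_students)]

r_all = pearson(math_scores, science_scores)
print(f"sample correlation: {r_all:.4f}")
```

The printed value fluctuates around 0 from seed to seed; it only equals 0 exactly in the theoretical calculation.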

Kurama · Mar 11, 2022

Thanks! I guess what was tripping me up is that, based on the language of the prompt, I can't tell whether it is declaring that the two distributions are independent of each other, or whether it means that each student's score is independent of the other students' scores, and not necessarily that the two distributions are independent of each other.

BigBeachBanana · Mar 11, 2022

It helps to define what X and Y represent. Say X is student 1's score, uniformly distributed over [0,100], and Y is student 2's score, uniformly distributed over [0,100]. Since student 1's score and student 2's score are independent, meaning they did not cheat or collaborate in any way on the test, their test scores are not correlated. If there were a correlation, it might suggest that those students were collaborating.

Kurama · Mar 11, 2022

What I meant was for X and Y to represent the math scores and science scores, respectively. So each student takes two tests, one math (X) and one science (Y). We are assuming that X and Y are independently distributed.

BigBeachBanana · Mar 11, 2022

If that's the case, then the independence assumption implies that the score a student receives on the math test tells you nothing about their score on the science test. However, we know that's not entirely true, since it's the same student taking both tests. There are going to be some interdependencies, such as cognitive ability, overlapping material or skill sets being tested on both exams, the possibility of the same teacher, etc.
I'm unsure of the context of the question. Is this for research, predictive modelling, a homework question, etc.?
Anyhow, realistically, independence wouldn't be a good assumption to make, but it does simplify a lot of the calculations.
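
On the follow-up from the first post (students whose combined score is greater than 100), which wasn't worked through above: once you restrict to X+Y>100, the pair (X,Y) is uniform over a triangle rather than the full square, and X and Y are no longer independent within that subset. A high X leaves more room for a low Y and vice versa, so the correlation becomes negative; for a uniform distribution over that triangle it works out to -1/2. Below is a rough simulation sketch under the same independence assumption. The sample size, seed, and helper name `pearson` are illustrative choices of mine.

```python
import math
import random


def pearson(xs, ys):
    """Sample version of Cov(X,Y) / (SD(X) * SD(Y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)   # arbitrary fixed seed so the run is repeatable
n_students = 200_000     # hypothetical sample size

# Independent math (X) and science (Y) scores, each uniform on [0, 100].
pairs = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(n_students)]

# Keep only students whose combined score exceeds 100.
subset = [(x, y) for x, y in pairs if x + y > 100]

r_subset = pearson([x for x, _ in subset], [y for _, y in subset])
print(f"students kept: {len(subset)}")
print(f"correlation within subset: {r_subset:.4f}")
```

So selecting on the combined score induces a correlation even though the underlying variables are independent; the simulated value should land near -0.5, up to sampling noise.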
