Hello everyone,
This is not homework, it's a real problem I encountered in data analysis. I'll try to state the problem as clearly as possible, while removing extraneous information about the specific application.
Statement of the Problem
I have two random vectors \(A^i\) and \(B^j\). Each element of each vector is a measurement of the same physical quantity (\(A\) or \(B\)), but each comes from a different data subset, so \(i\) and \(j\) both index the \(N\) data subsets available to me. I assume that the noise properties of these subsets are more or less the same, because there is nothing else I can do (I didn't conduct a whole ensemble of identical experiments to measure \(A\) and \(B\); I have only one dataset from one experiment). I also assume that each of the \(A^i\) values is independent of the others, and likewise for the \(B^j\) values. I estimate \(A\) and \(B\) by computing unweighted means over the subsets:
\(\displaystyle \bar{A} = \frac{1}{N} \sum_{i=1}^N A^i\)
\(\displaystyle \bar{B} = \frac{1}{N} \sum_{j=1}^N B^j\)
What I would like to know is: what is the covariance of these mean quantities in terms of the sample covariance of the two vectors? I.e., what is \(\mathrm{Cov}\left(\bar{A}, \bar{B}\right) = \mathrm{Cov}\left(\bar{B}, \bar{A}\right)\)?
My Attempt at a Solution
I start with the definition of covariance, and find that
(1)
\(\displaystyle \displaystyle \begin{aligned}\mathrm{Cov}\left(\bar{A}, \bar{B}\right) &\equiv E\left[\bar{A}\bar{B}\right] - E\left[\bar{A}\right]E\left[\bar{B}\right]\\
&= E\left[\left(\frac{1}{N} \sum_{i=1}^N A^i\right)\left(\frac{1}{N} \sum_{j=1}^N B^j\right)\right] - E\left[\frac{1}{N} \sum_{i=1}^N A^i\right]E\left[\frac{1}{N} \sum_{j=1}^N B^j\right]
\end{aligned} \)
First term in equation (1):
The expectation value is a linear operator, thus the first term can be written as:
\(\displaystyle \frac{1}{N^2} E\left[\left(\sum_{i=1}^N A^i\right)\left(\sum_{j=1}^N B^j\right)\right]\)
Expanding the product of the two sums distributes every term in the left sum over every term in the right sum, producing the sum of all \(N^2\) pairwise products (this is just distributivity). This can be written as:
\(\displaystyle \frac{1}{N^2} E\left[\sum_{i=1}^N \sum_{j=1}^N A^i B^j\right] = \frac{1}{N^2} \sum_{i=1}^N \sum_{j=1}^N E\left[A^i B^j\right]\)
where the right-hand side follows from the linearity of the expectation operator.
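The expansion of the product of sums used above can be checked symbolically for a small concrete \(N\); a minimal sketch using SymPy:

```python
import sympy as sp

# Check that (sum_i A_i)(sum_j B_j) equals the sum of all N^2 pairwise
# products A_i * B_j, for a small concrete case (here N = 3).
A = sp.symbols('A1:4')  # A1, A2, A3
B = sp.symbols('B1:4')  # B1, B2, B3

lhs = sum(A) * sum(B)
rhs = sum(a * b for a in A for b in B)

assert sp.expand(lhs - rhs) == 0  # the product of sums is the sum of all pairs
```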
Second term in equation (1):
By analogy with the manipulations above, the second term can be re-written as:
\(\displaystyle -\frac{1}{N^2} E\left[\sum_{i=1}^N A^i\right] E\left[\sum_{j=1}^N B^j\right] = -\frac{1}{N^2} \sum_{i=1}^N E\left[A^i\right] \sum_{j=1}^N E\left[B^j\right] = -\frac{1}{N^2} \sum_{i=1}^N \sum_{j=1}^N E\left[A^i\right] E\left[B^j\right]\)
Combining the two terms
(2)
\(\displaystyle \displaystyle \begin{aligned}\mathrm{Cov}\left(\bar{A}, \bar{B}\right) &= \frac{1}{N^2} \sum_{i=1}^N \sum_{j=1}^N \left( E\left[A^i B^j\right] - E\left[A^i\right] E\left[B^j\right]\right)\\
&= \frac{1}{N^2}\sum_{i=1}^N \sum_{j=1}^N \mathrm{Cov}\left(A^i, B^j\right)
\end{aligned}\)
where the last line follows from the definition of covariance. I think Eq. (2) is the formally correct, general answer. But, practically speaking, how would I compute this from the data? Again, it seems I have no choice but to assume that the covariance between \(A\) and \(B\) is the same regardless of which data subset each value comes from, perhaps because the noise properties of all the subsets are the same. Therefore I can estimate it from the sample covariance \(\sigma_{AB}\):
\(\displaystyle \mathrm{Cov}\left(A^i, B^j\right) \approx \sigma_{AB} \equiv \frac{1}{N-1} \sum_{k=1}^N \left(A^k - \bar{A}\right)\left(B^k - \bar{B}\right) \quad \forall\, i, j\)
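For concreteness, the \(1/(N-1)\) sample covariance above matches NumPy's default normalization in `np.cov`; a minimal sketch with placeholder data (the arrays `A` and `B` below are illustrative stand-ins, not the actual measurements):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal(10)            # stand-ins for the N measured A^i
B = 0.5 * A + rng.standard_normal(10)  # stand-ins for the N measured B^i

# Unbiased sample covariance: 1/(N-1) * sum_k (A_k - Abar)(B_k - Bbar)
N = len(A)
sigma_AB = ((A - A.mean()) * (B - B.mean())).sum() / (N - 1)

# np.cov uses the same 1/(N-1) normalization by default
assert np.isclose(sigma_AB, np.cov(A, B)[0, 1])
```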
Substituting this into Eq. (2) produces:
(3)
\(\displaystyle \mathrm{Cov}\left(\bar{A}, \bar{B}\right) = \frac{1}{N^2} \sum_{i=1}^N \sum_{j=1}^N \sigma_{AB} = \frac{1}{N^2} \left(N^2 \sigma_{AB}\right) = \sigma_{AB}\)
This is a very curious result: to get the covariance of the means, you don't divide the off-diagonal elements of the sample covariance matrix of the two vectors by anything? At first I thought this could make sense, because it says that if \(A\) and \(B\) are correlated, then averaging together several measurements of them doesn't reduce the correlation. But it seems it cannot be correct, for two reasons:
- For a diagonal element, instead of an off-diagonal one, it doesn't reduce to the well-known result that \(\mathrm{Cov}\left(\bar{A}, \bar{A}\right) = \mathrm{Var}\left(\bar{A}\right) \approx \sigma_A^2 / N\). If I substitute \(\sigma_A^2\) for \(\sigma_{AB}\) in Eq. (3), the \(N\)-dependence still cancels out, when it really shouldn't. There are fewer unique terms in the double sum of Eq. (2) in this case, but I haven't been able to figure out whether that remedies the problem.
- It doesn't agree with a simulation I ran, which suggests that all elements of the sample covariance matrix of the two vectors are divided by \(N\).
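The kind of Monte Carlo check described in the second point can be sketched as follows. All parameter values here are illustrative assumptions, and the noise is assumed correlated Gaussian with unit variances, so the true \(\sigma_{AB}\) is the chosen correlation:

```python
import numpy as np

rng = np.random.default_rng(0)

N = 25            # number of data subsets (illustrative)
trials = 200_000  # number of simulated "experiments" (illustrative)
rho = 0.6         # assumed true covariance sigma_AB (unit variances)

# For each trial, draw N independent subsets in which (A^i, B^i) are
# correlated with covariance rho within the same subset.
cov = np.array([[1.0, rho], [rho, 1.0]])
L = np.linalg.cholesky(cov)
z = rng.standard_normal((trials, N, 2))
samples = z @ L.T  # shape (trials, N, 2); [..., 0] is A, [..., 1] is B

A_bar = samples[..., 0].mean(axis=1)
B_bar = samples[..., 1].mean(axis=1)

# Empirical covariance of the means across the ensemble of trials
cov_means = np.cov(A_bar, B_bar)[0, 1]
print(cov_means, rho / N)  # in this setup these agree, i.e. sigma_AB / N
```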
Can anyone find a problem with the calculation steps or the assumptions above that would resolve this issue?