Covariance of the Sample Mean of Two Random Vectors

j-astron

Hello everyone,

This is not homework, it's a real problem I encountered in data analysis. I'll try to state the problem as clearly as possible, while removing extraneous information about the specific application.

Statement of the Problem

I have two random vectors \(A^i\) and \(B^j\). Each element of each vector is a measurement of the same physical quantity (\(A\) or \(B\)), but each comes from a different data subset, so \(i\) and \(j\) both index the \(N\) data subsets available to me. I assume that the noise properties of these subsets are more or less the same, because there is nothing else I can do (I didn't conduct a whole ensemble of identical experiments to measure \(A\) and \(B\); I have only one dataset from one experiment). I also assume that each of the \(A^i\) values is independent of the others, and likewise for each of the \(B^j\) values. I estimate \(A\) and \(B\) by computing unweighted means over the subsets:

\(\displaystyle \bar{A} = \frac{1}{N}\sum_{i=1}^N A^i\)
\(\displaystyle \bar{B} = \frac{1}{N}\sum_{j=1}^N B^j\)

What I would like to know is: what is the covariance of these mean quantities in terms of the sample covariance of the two vectors? I.e., what is \(\mathrm{Cov}(\bar{A}, \bar{B}) = \mathrm{Cov}(\bar{B}, \bar{A})\)?

My Attempt at a Solution

I start with the definition of covariance, and find that

(1)

\(\displaystyle \begin{aligned}\mathrm{Cov}\left(\bar{A}, \bar{B}\right) &\equiv E\left[\bar{A}\bar{B}\right] - E\left[\bar{A}\right]E\left[\bar{B}\right]\\
&= E\left[\left(\frac{1}{N} \sum_{i=1}^N A^i\right)\left(\frac{1}{N} \sum_{j=1}^N B^j\right)\right] - E\left[\frac{1}{N} \sum_{i=1}^N A^i\right]E\left[\frac{1}{N} \sum_{j=1}^N B^j\right]
\end{aligned} \)

First term in equation (1):

The expectation value is a linear operator, thus the first term can be written as:

\(\displaystyle \frac{1}{N^2}E\left[\left( \sum_{i=1}^N A^i\right)\left( \sum_{j=1}^N B^j\right)\right]\)

I think that the product of every term in the left summation with every term in the right summation simply produces a summation of all possible pairwise products (this is just the distributive law; see the \(N = 2\) check below). This can be written as:

\(\displaystyle \frac{1}{N^2}E\left[\sum_{i=1}^N\sum_{j=1}^N A^i B^j\right] = \frac{1}{N^2}\sum_{i=1}^N\sum_{j=1}^N E\left[A^i B^j\right]\)

where the right-hand side follows from the linearity of the expectation operator.
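
As a quick check of the pairwise-products claim, the \(N = 2\) case is just the distributive law:

\(\displaystyle \left(A^1 + A^2\right)\left(B^1 + B^2\right) = A^1 B^1 + A^1 B^2 + A^2 B^1 + A^2 B^2 = \sum_{i=1}^2\sum_{j=1}^2 A^i B^j\)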

Second term in equation (1):

By analogy with the manipulations above, the second term can be re-written as:

\(\displaystyle -\frac{1}{N^2}E\left[ \sum_{i=1}^N A^i\right]E\left[\sum_{j=1}^N B^j\right] = -\frac{1}{N^2} \sum_{i=1}^N E\left[A^i\right]\sum_{j=1}^N E\left[B^j\right] = -\frac{1}{N^2} \sum_{i=1}^N \sum_{j=1}^N E\left[A^i\right] E\left[B^j\right]\)

Combining the two terms

(2)

\(\displaystyle \begin{aligned}\mathrm{Cov}\left(\bar{A}, \bar{B}\right) &= \frac{1}{N^2} \sum_{i=1}^N \sum_{j=1}^N \left( E\left[A^i B^j\right] - E\left[A^i\right] E\left[B^j\right]\right)\\
&= \frac{1}{N^2}\sum_{i=1}^N \sum_{j=1}^N \mathrm{Cov}\left(A^i, B^j\right)
\end{aligned}\)

where the last line follows from the definition of covariance. I think that Eq. (2) is the formally correct, general answer. But, practically speaking, how would I compute this from the data? Again, it would seem that I have no choice but to assume that the covariance between A and B is the same regardless of which data subset each value comes from, perhaps because the noise properties of all the subsets are the same. Therefore I can estimate it from the sample covariance \(\sigma_{AB}\):

\(\displaystyle \mathrm{Cov}\left(A^i, B^j\right) \approx \sigma_{AB} \equiv \frac{1}{N-1} \sum_{k=1}^N \left(A^k - \bar{A}\right)\left(B^k - \bar{B}\right),~\forall~i,j\)
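
In practice this is just the off-diagonal element of the sample covariance matrix of the two vectors; a minimal NumPy sketch (the arrays here are placeholder values, not my actual per-subset measurements):

import numpy as np

# A and B hold the N per-subset measurements (placeholder values for illustration)
A = np.array([1.2, 0.9, 1.1, 1.0, 1.3])
B = np.array([2.1, 1.8, 2.0, 1.9, 2.4])

# np.cov uses the 1/(N-1) normalization by default
C = np.cov(A, B)       # 2x2 sample covariance matrix
sigma_AB = C[0, 1]     # sample covariance of A and B
sigma_A2 = C[0, 0]     # sample variance of A
sigma_B2 = C[1, 1]     # sample variance of B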

Substituting this into Eq. (2) produces:

(3)

\(\displaystyle \mathrm{Cov}\left(\bar{A}, \bar{B}\right) = \frac{1}{N^2}\sum_{i=1}^N \sum_{j=1}^N \sigma_{AB} = \frac{1}{N^2} \left(N^2\sigma_{AB}\right) = \sigma_{AB}\)

This is a very curious result. So to get the covariance of the means, you don't divide the off-diagonal elements of the sample covariance matrix of the two vectors by anything? At first I thought that this could make sense, because it's saying that if \(A\) and \(B\) are correlated, then averaging together several measurements of them doesn't reduce this correlation. But it seems like it cannot be correct, for two reasons:

  1. In the case of a diagonal element, instead of an off-diagonal one, it doesn't reduce to the well-known result that \(\mathrm{Cov}(\bar{A}, \bar{A}) = \mathrm{Var}(\bar{A}) \approx \sigma_A^2 / N\). If I substitute \(\sigma_A^2\) for \(\sigma_{AB}\) in Eq. (3), the \(N\)-dependence will still cancel out, when it really shouldn't. There are fewer unique terms in the double sum of Eq. (2) in this instance, but I haven't been able to figure out whether that remedies the problem.
  2. It doesn't agree with a simulation I ran, which suggests that all elements of the sample covariance matrix of the two vectors should be divided by \(N\) (a sketch of the kind of simulation I mean is at the end of this post).

Can anyone find a problem with the calculation steps or the assumptions above that would resolve this issue?
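
For reference, here is a minimal sketch of the kind of simulation mentioned in point 2 above; the means, covariance matrix, and number of subsets are made-up illustrative values, not the ones from my actual data:

import numpy as np

rng = np.random.default_rng(0)
N = 10              # number of data subsets per experiment
n_trials = 100000   # number of simulated experiments

# Assumed covariance structure of a single (A, B) measurement (illustrative values)
mean = [1.0, 2.0]
cov = [[1.0, 0.6],
       [0.6, 2.0]]

# Each trial: draw N correlated (A^i, B^i) pairs, then average each vector
samples = rng.multivariate_normal(mean, cov, size=(n_trials, N))  # shape (n_trials, N, 2)
A_bar = samples[:, :, 0].mean(axis=1)
B_bar = samples[:, :, 1].mean(axis=1)

# Covariance of the means across trials vs. the single-measurement covariance over N
print("Cov(A_bar, B_bar) from simulation:", np.cov(A_bar, B_bar)[0, 1])
print("sigma_AB / N:                     ", cov[0][1] / N)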
 
Hmm, no responses so far. Well, a suggestion from a colleague of mine was that even if A and B have some non-zero covariance, the samples \(A^i\) and \(B^j\) act like independent random variables for \(i \neq j\), because of the following reasoning. Suppose the covariance is positive. Then, if A happens to scatter high (relative to the mean) in the ith realization, B is more likely to have scattered high in that realization as well. But in the jth realization, it could very well be the case that both A and B happened to scatter low together. So, the A and B samples won't necessarily correlate between different realizations, only within a given one. Therefore, perhaps Eq. (3) in my previous post should have had

\(\displaystyle \sigma_{AB}\,\delta_{ij}\)

substituted into it, rather than just \(\sigma_{AB}\), where \(\delta_{ij}\) is the Kronecker delta. Since all the terms in the double sum for which \(i \neq j\) drop out, only \(N\) terms remain, and Eq. (3) would yield the expected result of \(N\sigma_{AB}/N^2 = \sigma_{AB}/N\).
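
Written out explicitly, that substitution gives

\(\displaystyle \mathrm{Cov}\left(\bar{A}, \bar{B}\right) = \frac{1}{N^2}\sum_{i=1}^N \sum_{j=1}^N \sigma_{AB}\,\delta_{ij} = \frac{1}{N^2}\sum_{i=1}^N \sigma_{AB} = \frac{N\sigma_{AB}}{N^2} = \frac{\sigma_{AB}}{N}\)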

Does anyone know how to express the idea above a little more rigorously?
 