# Combining two standard errors for combined set of Red and Yellow flowers

#### scrapples

##### New member
Hi all, I have some summary statistics from a data set that look something like this:

| Category | Color | Avg Growth | Std Error |
|---|---|---|---|
| Flowers | Red | 1.1 | 1.5 |
| Flowers | Blue | 2.4 | 1.5 |
| Flowers | Yellow | 5.1 | 1.4 |
| Flowers | All Colors | 2.9 | 0.8 |

What I would love to do is calculate/estimate the average growth and standard error of the combined set of Red and Yellow flowers. Ideally, I'd have access to the underlying dataset so I could calculate these measures directly, but I don't have access to the raw data.

Is there a way I can estimate the combined Avg Growth and Std Error of the Red and Yellow flowers?

Thank you!

#### tkhunny

##### Moderator
Staff member
Couple of items:

1) Simply averaging the Avg Growth may not be the right thing to do. One would need to have very similar sample sizes.
2) Adding Squared Errors might not be terrible, as long as you can ensure that the samples are independent.

Did "All Colors" come with the data? If it came with the data, I suppose you could attempt some decomposition of it to estimate the relative sizes of the three individual populations. Still, with only two constraints, you will have to guess, somewhere.
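To make the "two constraints, three unknowns" point concrete, here is a minimal sketch. It assumes the "All Colors" mean is the sample-size-weighted mean of the three color means, and that the table values are exact rather than rounded. The function name `weights_given_blue` and the trial values for Blue's share are hypothetical, chosen only for illustration.

```python
# Group means and the overall mean, taken from the table in the first post.
MEAN_RED, MEAN_BLUE, MEAN_YELLOW = 1.1, 2.4, 5.1
MEAN_ALL = 2.9


def weights_given_blue(w_blue):
    """Solve the two constraints for (w_red, w_yellow) once w_blue is guessed.

    Constraints: the three weights sum to 1, and the weighted mean of the
    three group means equals MEAN_ALL.
    """
    remaining_weight = 1.0 - w_blue
    remaining_mean_mass = MEAN_ALL - w_blue * MEAN_BLUE
    # w_red + w_yellow                         = remaining_weight
    # w_red*MEAN_RED + w_yellow*MEAN_YELLOW    = remaining_mean_mass
    w_yellow = (remaining_mean_mass - MEAN_RED * remaining_weight) / (
        MEAN_YELLOW - MEAN_RED
    )
    w_red = remaining_weight - w_yellow
    return w_red, w_yellow


# Blue's share is the free parameter that has to be guessed.
for w_blue in (0.2, 1 / 3, 0.5):
    w_red, w_yellow = weights_given_blue(w_blue)
    print(f"blue={w_blue:.3f}  red={w_red:.3f}  yellow={w_yellow:.3f}")
```

Every guess for Blue's share gives a different but equally consistent split between Red and Yellow, which is why the overall mean alone cannot pin down the relative sample sizes.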

#### scrapples

##### New member
Thanks! It's definitely tough without knowing the population sizes. The populations should be roughly the same size, though, so I'm planning to assume they are equal (not ideal, but close enough for my purposes).

Would it be reasonable to calculate the standard error of the combined Red and Yellow this way:

(avg of Red and Yellow std error) / (sq root of 2)

basically applying a factor of 1/sqrt(n), where n is the number of std error measurements?

#### tkhunny

##### Moderator
Staff member
No, that's a terrible estimate. It can result in something less than either of the two you started with. That's just far too presumptuous. If I had to, I'd just go with $$\displaystyle \sqrt{SE_{1}^{2} + SE_{2}^{2}}$$. In any case, it's just not clear at all how useful it will be.
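As a quick numerical check, the sketch below plugs the Red and Yellow standard errors from the table into both formulas discussed here. It assumes the two samples are independent; the variable names are illustrative only.

```python
import math

# Standard errors for Red and Yellow, from the table in the first post.
se_red, se_yellow = 1.5, 1.4

# Root-sum-of-squares combination (assumes independent samples).
se_rss = math.sqrt(se_red**2 + se_yellow**2)

# The rejected proposal: average of the two SEs divided by sqrt(2).
se_avg_over_root2 = (se_red + se_yellow) / 2 / math.sqrt(2)

print(f"root sum of squares: {se_rss:.3f}")
print(f"average / sqrt(2):   {se_avg_over_root2:.3f}")
print("proposal is below both inputs:", se_avg_over_root2 < min(se_red, se_yellow))
```

With these inputs the averaged-and-scaled figure does fall below both of the original standard errors, while the root-sum-of-squares figure is larger than either.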

#### scrapples

##### New member
But the standard error should decrease as the sample size n increases, right? If I'm assuming that I double n by combining the Red and Yellow observations, shouldn't the standard error of the combined data set be less than the standard error of either the Red data set or the Yellow data set?
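One way to put numbers on this question is sketched below. It rests on two assumptions: the Red and Yellow samples are independent, and they are the same size, so the combined mean is the plain average of the two group means. For an average of two independent estimates, the variance is $(SE_{1}^{2} + SE_{2}^{2})/4$. This is not the standard error you would get from pooling the raw observations, which would also depend on n and on how far apart the two group means are.

```python
import math

# Means and standard errors for Red and Yellow, from the table in the first post.
mean_red, se_red = 1.1, 1.5
mean_yellow, se_yellow = 5.1, 1.4

# Equal sample sizes assumed, so the combined mean is the plain average.
combined_mean = (mean_red + mean_yellow) / 2

# Var((X1 + X2) / 2) = (Var(X1) + Var(X2)) / 4 for independent X1 and X2.
combined_se = math.sqrt(se_red**2 + se_yellow**2) / 2

print(f"combined mean: {combined_mean:.2f}")
print(f"combined SE:   {combined_se:.3f}")
```

Under these assumptions the result is smaller than either input standard error, in line with the intuition that more observations should tighten the estimate of the mean.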