Combining two standard errors for combined set of Red and Yellow flowers

scrapples · Jan 29, 2018

Hi all, I have some summary statistics from a data set that look something like this:

Category   Color        Avg Growth   Std Error
Flowers    Red          1.1          1.5
Flowers    Blue         2.4          1.5
Flowers    Yellow       5.1          1.4
Flowers    All Colors   2.9          0.8

What I would love to do is calculate/estimate the average growth and standard error of the combined set of Red and Yellow flowers. Ideally, I'd have access to the underlying dataset so I could calculate these measures directly, but I don't have access to the raw data.

Is there a way I can estimate the combined Avg Growth and Std Error of the Red and Yellow flowers?

Any advice is appreciated.

Thank you!

tkhunny · Jan 29, 2018

Couple of items:

1) Simply averaging the Avg Growth may not be the right thing to do. One would need to have very similar sample sizes.
2) Adding Squared Errors might not be terrible, as long as you can ensure that the samples are independent.

Did "All Colors" come with the data? If it came with the data, I suppose you could attempt some decomposition of it to estimate the relative sizes of the three individual populations. Still, with only two constraints, you will have to guess, somewhere.
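As a rough plausibility check on the sample sizes, one could see whether the "All Colors" row is reproduced by treating the three colors as equally sized, independent samples. The sketch below is only that: the `combine` helper and the equal weights are assumptions for illustration, not something given with the data.

```python
from math import sqrt

# Summary rows from the original post: (mean growth, standard error)
groups = {"Red": (1.1, 1.5), "Blue": (2.4, 1.5), "Yellow": (5.1, 1.4)}

def combine(stats, weights):
    """Weighted average of independent group means, with its standard error.

    stats   : list of (mean, standard_error) pairs
    weights : relative group sizes (hypothetical here, since n is unknown)
    """
    total = sum(weights)
    w = [x / total for x in weights]
    mean = sum(wi * m for wi, (m, _) in zip(w, stats))
    # Variances of independent estimates add, each scaled by its squared weight.
    se = sqrt(sum((wi * s) ** 2 for wi, (_, s) in zip(w, stats)))
    return mean, se

# Assumption: equal group sizes, so every color gets the same weight.
mean_all, se_all = combine(list(groups.values()), [1, 1, 1])
print(f"equal-size estimate: mean={mean_all:.2f}, SE={se_all:.2f}")
print("reported All Colors: mean=2.90, SE=0.80")
```

Under these assumptions the estimate comes out at roughly 2.87 and 0.85, which is close to the reported 2.9 and 0.8. That is consistent with similar group sizes, though it does not prove them.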

scrapples · Jan 30, 2018

tkhunny said:
Couple of items:

1) Simply averaging the Avg Growth may not be the right thing to do. One would need to have very similar sample sizes.
2) Adding Squared Errors might not be terrible, as long as you can ensure that the samples are independent.

Did "All Colors" come with the data? If it came with the data, I suppose you could attempt some decomposition of it to estimate the relative sizes of the three individual populations. Still, with only two constraints, you will have to guess, somewhere.

Thanks! It's definitely tough without knowing the population sizes, but they should be roughly equal, so I'm planning to assume they are the same size (not ideal, but close enough for my purposes).

Would it be reasonable to calculate the standard error of the combined Red and Yellow this way:

(avg of Red and Yellow std error) / (sq root of 2)

basically applying 1/sqrt(n) where n is number of std error measurements?

tkhunny · Jan 30, 2018

No, that's a terrible estimate. It can result in something less than either of the two you started with. That's just far too presumptuous. If I had to, I'd just go with \(\displaystyle \sqrt{SE_{1}^{2} + SE_{2}^{2}}\). In any case, it's just not clear at all how useful it will be.
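For reference, the candidates discussed so far can be put side by side numerically. This is a sketch assuming independent samples of equal size; the variable names are made up for illustration. Note that the plain root-sum-of-squares is the standard error of the sum (or difference) of two independent means, while dividing it by 2 gives the standard error of their equal-weight average.

```python
from math import sqrt

# Values from the summary table in the first post
mean_red, se_red = 1.1, 1.5
mean_yellow, se_yellow = 5.1, 1.4

# Assumption: equal sample sizes, so the combined mean is a simple average.
combined_mean = (mean_red + mean_yellow) / 2

# Proposed rule: average of the two standard errors divided by sqrt(2)
rule_avg = ((se_red + se_yellow) / 2) / sqrt(2)

# Root-sum-of-squares: standard error of the SUM of two independent means
rss = sqrt(se_red**2 + se_yellow**2)

# Half of that: standard error of the equal-weight AVERAGE of the two means
half_rss = rss / 2

print(f"combined mean          : {combined_mean:.2f}")
print(f"avg(SE) / sqrt(2)      : {rule_avg:.3f}")
print(f"sqrt(SE1^2 + SE2^2)    : {rss:.3f}")
print(f"sqrt(SE1^2 + SE2^2) / 2: {half_rss:.3f}")
```

With these inputs the second and fourth lines nearly coincide (about 1.03 each), because the two standard errors are almost equal; the unscaled root-sum-of-squares is about 2.05.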

scrapples · Feb 5, 2018

tkhunny said:
No, that's a terrible estimate. It can result in something less than either of the two you started with. That's just far too presumptuous. If I had to, I'd just go with \(\displaystyle \sqrt{SE_{1}^{2} + SE_{2}^{2}}\). In any case, it's just not clear at all how useful it will be.

But the standard error should decrease as the n size increases, right? If I'm estimating that I'm doubling the n size by combining the Red and Yellow observations, shouldn't the standard error of the combined dataset be less than the standard error of either the Red data set or the Yellow dataset?
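One way to check this intuition is to write the combined mean as the average of the two group means. The following is a sketch that assumes independent samples, an equal size \(n\) for each color, and (for the last step only) a common per-observation standard deviation \(\sigma\); none of these are given in the data.

\[
\bar{x}_{RY} = \tfrac{1}{2}\left(\bar{x}_{R} + \bar{x}_{Y}\right),
\qquad
SE_{RY} = \sqrt{\tfrac{1}{4}SE_{R}^{2} + \tfrac{1}{4}SE_{Y}^{2}}
        = \tfrac{1}{2}\sqrt{SE_{R}^{2} + SE_{Y}^{2}}
\]

\[
SE_{R} = SE_{Y} = \frac{\sigma}{\sqrt{n}}
\quad\Longrightarrow\quad
SE_{RY} = \tfrac{1}{2}\sqrt{\frac{2\sigma^{2}}{n}} = \frac{\sigma}{\sqrt{2n}}
\]

Under those assumptions the combined standard error shrinks by a factor of \(\sqrt{2}\), in line with doubling the sample size. This describes the average of the two group means; a standard error computed from the pooled raw data would also reflect the gap between the Red and Yellow means, so it need not match exactly.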
