Need a function that compares the variation of samples consistently, regardless of the numerical size of the values

Thadriel · Mar 31, 2022

My first thought was standard deviation. But that doesn't work, because a set of 10 three-digit numbers will have a standard deviation of a different order of magnitude than a set of 10 six-digit numbers.

For example, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550 has a (sample) standard deviation of 151.38,

while 100000, 100500, 200000, 200500, 300000, 300500, 400000, 400500, 500000, 500500 has a (sample) standard deviation of 149,071.

So I'm looking for a function that will compare their RELATIVE variation. I guess something that normalizes based on order of magnitude.

Is there any such function out there or do I have to do the work and create it myself?
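The two standard deviations quoted in this post can be reproduced with a minimal sketch like the one below. It assumes Python's standard `statistics` module; the variable names `small` and `large` are only illustrative. The quoted figures match the sample formula (dividing by n - 1), not the population formula.

```python
from statistics import stdev, pstdev

# The two data sets from the post (names are illustrative, not from the thread).
small = [100, 150, 200, 250, 300, 350, 400, 450, 500, 550]
large = [100000, 100500, 200000, 200500, 300000,
         300500, 400000, 400500, 500000, 500500]

# stdev() divides by n - 1 (sample); pstdev() divides by n (population).
sd_small = stdev(small)
sd_large = stdev(large)
psd_small = pstdev(small)

print(f"sample SD, three-digit set: {sd_small:.2f}")
print(f"sample SD, six-digit set:   {sd_large:.0f}")
print(f"population SD, three-digit set: {psd_small:.2f}")
```

The raw standard deviations differ by roughly three orders of magnitude, which is the scale problem the post describes.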

Thadriel · Mar 31, 2022

I guess I could just divide standard deviation by the mean, could I not? Does that have a name?

Thadriel · Mar 31, 2022

Actually, before I'm satisfied with this, I don't think the standard deviation will always have the same number of digits as the mean. If there are cases where it would not, then whatever this is would not satisfy what I'm after.

BigBeachBanana · Mar 31, 2022

If you want a formal statistical comparison of two variances, the F-test is commonly used.

Thadriel said:
I guess I could just divide standard deviation by the mean, could I not? Does that have a name?

Yes, it's called the coefficient of variation.
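As a rough illustration of the coefficient of variation applied to the two data sets from the first post, here is a small sketch. It assumes the sample standard deviation and Python's standard `statistics` module; the function name `coefficient_of_variation` is a hypothetical helper, not something from the thread.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (hypothetical helper)."""
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return stdev(values) / m

small = [100, 150, 200, 250, 300, 350, 400, 450, 500, 550]
large = [100000, 100500, 200000, 200500, 300000,
         300500, 400000, 400500, 500000, 500500]

cv_small = coefficient_of_variation(small)
cv_large = coefficient_of_variation(large)

# Multiplying every value by a constant leaves the ratio unchanged.
cv_small_scaled = coefficient_of_variation([x * 1000 for x in small])

print(f"{cv_small:.4f}  {cv_large:.4f}  {cv_small_scaled:.4f}")
```

Both ratios come out a little under 0.5, so the two sets look comparable once the scale is divided out.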

Thadriel · Mar 31, 2022

Thanks. So the problem with coefficient of variation is that it still depends on standard deviation, which can vary wildly between different samples we might want to compare. I need something that's going to be normalized between 0 and 1 no matter what data samples are used. Is there such a thing?

I'll read up on the F-test before I say anything else, but before I do, could I not use mean absolute deviation, then divide by the mean?

Let me test with these numbers (I chose the mean as the central point m(X)):

For the three-digit numbers, it's 125/325 = 0.385.

For the six-digit numbers, it's 120050/300250 = 0.3998.

It looks like this will work.
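The arithmetic in this post can be checked with a short sketch. It assumes the mean is used as the central point, as the post does; the function name `relative_mean_abs_deviation` is hypothetical. The third data set, `skewed`, is an invented example added only to probe the 0-to-1 requirement.

```python
from statistics import fmean

def relative_mean_abs_deviation(values):
    """Mean absolute deviation around the mean, divided by the mean (hypothetical helper)."""
    m = fmean(values)
    if m == 0:
        raise ValueError("ratio is undefined for a zero mean")
    mad = fmean([abs(x - m) for x in values])
    return mad / m

small = [100, 150, 200, 250, 300, 350, 400, 450, 500, 550]
large = [100000, 100500, 200000, 200500, 300000,
         300500, 400000, 400500, 500000, 500500]
skewed = [0, 0, 0, 0, 100]  # invented data, not from the thread

r_small = relative_mean_abs_deviation(small)
r_large = relative_mean_abs_deviation(large)
r_skewed = relative_mean_abs_deviation(skewed)

print(f"{r_small:.4f}  {r_large:.4f}  {r_skewed:.4f}")
```

The first two values agree with the figures in the post. The invented skewed set gives a ratio above 1, so this measure is scale-free but does not appear to be confined to the range 0 to 1 in general.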

BigBeachBanana · Mar 31, 2022

I think the coefficient of variation should be sufficient for your purpose, since your data are in an increasing arithmetic sequence (as if they were man-made).

Caution: The F-test for equality of two variances is very sensitive to deviations from normality. If the two distributions are not normal, or close to normal, the test can give unreliable results.
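For reference, the F statistic itself is just the ratio of the two sample variances. The sketch below computes only that ratio for the data sets from the first post, using Python's standard `statistics` module. It does not compute a p-value, which would need the F distribution (for example from SciPy's `scipy.stats.f`). The rescaled list is an illustration added here, not something proposed in the thread.

```python
from statistics import variance

small = [100, 150, 200, 250, 300, 350, 400, 450, 500, 550]
large = [100000, 100500, 200000, 200500, 300000,
         300500, 400000, 400500, 500000, 500500]

# F statistic: ratio of sample variances (each uses the n - 1 divisor).
f_raw = variance(large) / variance(small)

# Illustration only: express the small set in units 1000 times smaller.
# The F statistic changes, because it compares absolute variances.
small_rescaled = [x * 1000 for x in small]
f_rescaled = variance(large) / variance(small_rescaled)

print(f"F on raw data:      {f_raw:.1f}")
print(f"F after rescaling:  {f_rescaled:.4f}")
```

Because the ratio moves with the units of the data, the F statistic on raw values compares absolute spread, while the coefficient of variation compares spread relative to the mean.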
