Need a function that compares the variation of samples on a uniform scale, regardless of the numerical size of the values

Thadriel (New member, joined Feb 25, 2022, 20 messages)
My first thought was standard deviation. But that doesn't work, because a population of 10 three-digit numbers will have a standard deviation of a different order of magnitude than a population of 10 six-digit numbers.

For example: 100, 150, 200, 250, 300, 350, 400, 450, 500, 550 has a (sample) standard deviation of 151.38,

while 100000, 100500, 200000, 200500, 300000, 300500, 400000, 400500, 500000, 500500 has a (sample) standard deviation of 149,071.

So I'm looking for a function that will compare their RELATIVE variation. I guess something that normalizes based on order of magnitude.

Is there any such function out there or do I have to do the work and create it myself?
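
To make the two figures above reproducible, here is a minimal Python sketch. It assumes the quoted values are sample standard deviations (n - 1 denominator), which is what the numbers are consistent with; the variable names `small` and `large` are only illustrative.

```python
import statistics

# The two example data sets from the post
small = [100, 150, 200, 250, 300, 350, 400, 450, 500, 550]
large = [100000, 100500, 200000, 200500, 300000,
         300500, 400000, 400500, 500000, 500500]

# statistics.stdev uses the n - 1 (sample) denominator
sd_small = statistics.stdev(small)
sd_large = statistics.stdev(large)

print(round(sd_small, 2))  # 151.38
print(round(sd_large))     # 149071
```

The raw standard deviations differ by roughly three orders of magnitude simply because the values themselves do.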
 
I guess I could just divide standard deviation by the mean, could I not? Does that have a name?
 
Actually, before I settle on this: I don't think the standard deviation will always have the same number of digits as the mean. If there are cases where it doesn't, then this ratio would not do what I'm after.
 
Is there any such function out there or do I have to do the work and create it myself?
If you want a formal statistical comparison of two variances, the F-test is often used.
I guess I could just divide standard deviation by the mean, could I not? Does that have a name?
Yes, it's called the coefficient of variation.
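
As a rough illustration, the coefficient of variation for the two example data sets can be computed as below. This is only a sketch: it assumes the sample standard deviation is used, and `coefficient_of_variation` is a helper name made up for this example, not a library function.

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation divided by the mean (illustrative helper)."""
    return statistics.stdev(data) / statistics.mean(data)

small = [100, 150, 200, 250, 300, 350, 400, 450, 500, 550]
large = [100000, 100500, 200000, 200500, 300000,
         300500, 400000, 400500, 500000, 500500]

cv_small = coefficient_of_variation(small)
cv_large = coefficient_of_variation(large)

print(round(cv_small, 3))  # 0.466
print(round(cv_large, 3))  # 0.496
```

Both values land on the same scale even though the data differ by three orders of magnitude. The ratio is only meaningful when the mean is positive and not close to zero.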
 
Thanks. So the problem with coefficient of variation is that it still depends on standard deviation, which can vary wildly between different samples we might want to compare. I need something that's going to be normalized between 0 and 1 no matter what data samples are used. Is there such a thing?

I'll read up on the F-test before I say anything else, but before I do, could I not use mean absolute deviation, then divide by the mean?

Let me test with these numbers (I chose the central point m(X) to be just the mean):

For the three-digit numbers, it's 125/325 = 0.385.

For the six-digit numbers, it's 120050/300250 = 0.3998.


It looks like this will work.
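
The calculation above can be checked with a short Python sketch. It assumes the deviations are taken about the mean, as stated in the post; `relative_mean_abs_deviation` is a hypothetical helper name for this example.

```python
import statistics

def relative_mean_abs_deviation(data):
    """Mean absolute deviation about the mean, divided by the mean."""
    m = statistics.mean(data)
    mad = sum(abs(x - m) for x in data) / len(data)
    return mad / m

small = [100, 150, 200, 250, 300, 350, 400, 450, 500, 550]
large = [100000, 100500, 200000, 200500, 300000,
         300500, 400000, 400500, 500000, 500500]

r_small = relative_mean_abs_deviation(small)
r_large = relative_mean_abs_deviation(large)

print(round(r_small, 4))  # 0.3846
print(round(r_large, 4))  # 0.3998
```

One caveat: this ratio is scale-free, but it is not guaranteed to stay between 0 and 1. For non-negative data with one large value and the rest zero it approaches 2.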
 
I think the coefficient of variation should be sufficient for your purpose since your data is in an increasing arithmetic sequence (as if they're man-made).
Caution: The F-test for equality of two variances is very sensitive to deviations from normality. If the two distributions are not normal, or close, the test can give a biased result for the test statistic.
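
For reference, the F statistic itself is just the ratio of the two sample variances. The sketch below computes only that ratio for the example data; the p-value (which would come from an F distribution with n1 - 1 and n2 - 1 degrees of freedom) is deliberately left out, and the variable names are illustrative.

```python
import statistics

small = [100, 150, 200, 250, 300, 350, 400, 450, 500, 550]
large = [100000, 100500, 200000, 200500, 300000,
         300500, 400000, 400500, 500000, 500500]

# Sample variances (n - 1 denominator)
var_small = statistics.variance(small)
var_large = statistics.variance(large)

# F statistic: ratio of the two sample variances
f_stat = var_large / var_small
print(round(f_stat))  # 969700
```

Because the statistic is built from raw variances, it reflects the difference in scale between the two data sets, not their relative variation, so it answers a different question from the coefficient of variation.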
 