Harmonic mean use case

mcstatz5829 (New member; joined Apr 23, 2020; 8 messages)
| Before | After | Increase | % Increase |
|---|---|---|---|
| 5 | 9.6 | 1.92 | 92% |
| 6 | 9.84 | 1.64 | 64% |
| 7 | 8.82 | 1.26 | 26% |
| 8 | 12.56 | 1.57 | 57% |

Let's say I have a set of data and I think that a harmonic mean best describes the average of the data. Let's say I'm analyzing mph and I don't really care about the total distance, so harmonic mean makes sense.

I implement a trial and the data changes.

Question 1: is it fair to take the harmonic mean of the changes to report the average change?

Question 2: The harmonic mean of column 3 is 1.56. The harmonic mean of column 4 is 48%. Which is the more accurate representation?
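As a quick check of the two figures in Question 2, here is a minimal Python sketch (not part of the original post) that takes the harmonic mean of each column. The variable names are illustrative; the numbers are copied from the table.

```python
from statistics import harmonic_mean

# Values copied from the "Increase" and "% Increase" columns of the table.
ratios = [1.92, 1.64, 1.26, 1.57]   # after / before
percents = [92, 64, 26, 57]         # (after / before - 1) * 100

hm_ratio = harmonic_mean(ratios)
hm_percent = harmonic_mean(percents)

print(f"harmonic mean of ratios:        {hm_ratio:.2f}")
print(f"  ...expressed as a % increase: {(hm_ratio - 1) * 100:.1f}%")
print(f"harmonic mean of % increases:   {hm_percent:.1f}%")
```

The two columns carry the same information, yet the first result corresponds to roughly a 56% increase and the second to roughly 48%. The harmonic mean is not preserved when a constant is subtracted from every value, which is why the two answers disagree.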
 
Sorry, but you made me laugh out loud with "Question 1". Nothing personal, but you are not listening to the words forming in your head. If you want the "average change" you have to calculate the "average change". Why would a harmonic mean do that? It makes no sense at all.

The average change takes ALL the data from EVERY individual calculation. There is NO shortcut.

Think about a batting average.

Player #1 is hitting 1.000. This player is 1 for 1.
Player #2 is hitting 0.285. This player is 116 for 407.
Just a little thought suggests that the mean batting average for these two players is not very far from the individual batting average of Player #2.

So, your harmonic mean of batting averages is $\dfrac{2}{\frac{1}{1.000} + \frac{1}{0.285}} = 0.444$. (The geometric mean, $\sqrt{1.000\cdot 0.285} = 0.534$, is no better.)

This has NOTHING to do with the correct total batting average: $\dfrac{1 + 116}{1 + 407} = \dfrac{117}{408} = 0.287$
If you don't have all the data, you must weight them VERY CAREFULLY.
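The batting example can be sketched in a few lines of Python. This is an illustration added for clarity, not the poster's own calculation; it compares the pooled average with the unweighted harmonic and geometric means of the two individual averages, and shows that weighting each average by its at-bats recovers the pooled figure.

```python
from statistics import geometric_mean, harmonic_mean

players = [(1, 1), (116, 407)]  # (hits, at-bats) for Player #1 and Player #2

averages = [hits / at_bats for hits, at_bats in players]
total_hits = sum(hits for hits, _ in players)
total_at_bats = sum(at_bats for _, at_bats in players)

# Pooled average: all hits over all at-bats.
pooled = total_hits / total_at_bats

# Weighting each player's average by at-bats gives the same number.
weighted = sum(avg * at_bats
               for avg, (_, at_bats) in zip(averages, players)) / total_at_bats

hm = harmonic_mean(averages)
gm = geometric_mean(averages)

print(f"pooled:    {pooled:.3f}")
print(f"weighted:  {weighted:.3f}")
print(f"harmonic:  {hm:.3f}")
print(f"geometric: {gm:.3f}")
```

Neither unweighted mean lands near the pooled value of about 0.287, because both ignore the fact that one average rests on 1 at-bat and the other on 407.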

If you calculate the proper average increase, $\dfrac{9.6+9.84+8.82+12.56}{5+6+7+8} = \dfrac{40.82}{26} = 1.57$, you might THINK your 1.56 is closer than your 48%. Unfortunately, this is nothing but coincidence. It is not a reliable observation. It will not serve you in the long run. Frankly, it may not serve you in ANY other calculation - ever.

Please rethink whatever it is you are doing. There might be a harmonic mean in there, somewhere, but we haven't seen it, yet.
 
No need to belittle me. I already have all the underlying data. I know exactly what the weighted averages are before and after. I just don’t care about the weightings. They are not useful.

Let’s take your batting scenario. I’m a coach and I design a new training program. I implement it and batter A has an improvement after 50 at bats.

So now I try batter B. Improvement over 50 at bats. At this point, we now have 100 at bats for batter A.

Now when I am projecting the improvement for batter C, should I take into account that A is a bigger sample size than B? I don’t think that’s an easy answer. If A improved and regressed, or continued to improve, I would argue the samples should be weighted. If both are step changes, then I would argue weighting isn’t useful.

Back to my actual data set: I am measuring productivity increases for workers. If I weight it, it artificially pushes the data towards regions that have higher initial productivity and higher head counts, which is not useful to me at all.
 
I think tkhunny missed your point, so let's start over.

I think you need to explain the details a little better, so we can follow your reasoning. For example, you say here that you are "analyzing mph"; but evidently that isn't what you are really doing, so that's confusing. And what you call "increase" here is apparently not what I would call an increase (after - before), but a ratio. Then you talk about "the average change"; which of these numbers are you referring to?

You are right that in some things you can do involving speeds, the harmonic mean is useful. But even if we just accept that you know what you are talking about, and that is appropriate for your problem, to what specifically does the harmonic mean apply? Am I to assume you mean that the harmonic mean of 5, 6, 7, 8 in your first column is an appropriate average to use in combining them? It will be much clearer if you state such things explicitly.

Getting past that, I think your question is, if it is appropriate to use the harmonic mean when combining some data Xi, does that imply that the harmonic mean of the percent increases of the Xi, or of the ratios, is also appropriate?

I think the answer is No, largely because change involves subtraction, and harmonic means involve reciprocals, and subtraction doesn't "play well" with reciprocals (or division in general). In fact, the answer would be No if you were talking about arithmetic means, for a similar reason. But I'd be much more certain if I knew what these numbers really mean, and what you are doing with the "average". Then I could actually try out a calculation to demonstrate the effect of averaging.
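To make that concrete, here is a small Python sketch using the four rows from the original table (an illustration only, since what the numbers really represent is still unclear). For each kind of mean it compares "the average of the ratios" with "the ratio of the averages".

```python
from statistics import harmonic_mean, mean

before = [5, 6, 7, 8]
after = [9.6, 9.84, 8.82, 12.56]
ratios = [a / b for a, b in zip(after, before)]

results = {}
for name, avg in (("arithmetic", mean), ("harmonic", harmonic_mean)):
    ratio_of_averages = avg(after) / avg(before)   # average first, then compare
    average_of_ratios = avg(ratios)                # compare first, then average
    results[name] = (ratio_of_averages, average_of_ratios)
    print(f"{name:>10}: ratio of averages = {ratio_of_averages:.4f}, "
          f"average of ratios = {average_of_ratios:.4f}")
```

In both cases the two numbers differ, so knowing that a particular mean suits the raw data does not by itself say that the same mean suits the ratios or percent changes.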
 
I used mph as an example, but for the actual use case it's "work completed/hour".

The situation is that we piloted a new training program in dozens of our markets. Most saw an increase in "work completed/hour", enough to justify expanding the investment elsewhere. The question I am trying to answer is: what impact should we plan for in the other areas we are expanding to?
I.e., work/hours * (1+x%) = future work/hours. I'm trying to determine a reasonable x%.

It would be trivial to do something like (work1 + work2 + work3)/(hours1 + hours2 + hours3) for before and after and take the difference, but there are several reasons why I think that's not appropriate:
  1. a unit of work in one market is not the same as a unit of work in another market, and normalizing them is an exercise full of inexact assumptions using fuzzy data
  2. work amounts in some areas are orders of magnitude higher than work amounts in others, for reasons I believe are unrelated to the underlying productivity increase. I don't think it makes sense to weight the results towards those areas, which may present an overly optimistic projection. Reasons for the different work amounts:
    1. attrition levels
    2. size of market
    3. duration of pilot
  3. most of the data points contain thousands of work units and thousands of hours, so I feel the risk of an outlier due to small sampling is low
I'm tempted to take the median improvement, but since we're talking rates and we don't really care what the size of the numerator is, I was considering using a harmonic mean instead (the analogy in my head is if I'm averaging the speeds for different cars, and I don't care how far they drove).

Hopefully that makes clear what I am trying to achieve.


Side note, the Increase and % Increase...

Original value * 1.2 is equivalent to a 20% increase.

So if productivity is 1 and goes to 1.2, I can report that as 1.2 times original, or 20% increase. But only one of those numbers is guaranteed to be positive, which is required for harmonic means. But even when they are all positive, the calculation results in a different mean.
 
First, let's think about harmonic means; then we can start to think about what is really applicable for your situation.

The place where harmonic mean is relevant to speeds is to find the average speed of a car if it has gone at different speeds over equal distances on a trip. The average is defined as the total distance divided by total time; each individual time is the fixed lap distance divided by speed, so the average speed is the harmonic mean of the individual speeds. This is a very specific situation, with equal distances, and does not mean that any time you are averaging rates, you should use the harmonic mean.
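A minimal Python sketch of that statement, with made-up speeds and distances (both hypothetical, chosen only for round numbers): with equal distances the overall speed matches the harmonic mean, and with unequal distances it does not.

```python
from statistics import harmonic_mean

def overall_speed(speeds, distances):
    """Total distance divided by total time."""
    total_time = sum(d / v for v, d in zip(speeds, distances))
    return sum(distances) / total_time

speeds = [30.0, 60.0]            # mph on each leg (hypothetical)

equal_legs = [10.0, 10.0]        # miles: same distance on every leg
unequal_legs = [10.0, 50.0]      # miles: different distances

avg_equal = overall_speed(speeds, equal_legs)
avg_unequal = overall_speed(speeds, unequal_legs)
hm = harmonic_mean(speeds)

print(f"equal legs:    {avg_equal:.2f} mph")
print(f"harmonic mean: {hm:.2f} mph")
print(f"unequal legs:  {avg_unequal:.2f} mph")
```

With equal legs both figures come out to 40 mph. With the unequal legs the overall speed is about 51.4 mph, so the plain harmonic mean no longer applies and a distance-weighted version would be needed.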

The parallel in your case would be, perhaps, if in each market they did the same amount of total work. That's not true.

Note also that your (work1 + work2 + work3)/(hours1 + hours2 + hours3) is the definition of what the "appropriate average" must be, if we are talking about one entity moving/working at different rates sequentially; it's the constant rate at which the same thing would be accomplished in the same time. The harmonic mean is just a shortcut to that, in a specific situation. (This is also the justification for weighted averages.)

So the first question is, are you really talking about one thing being accomplished in multiple parts? I guess so, but I'd have to think more. In any case, the way to determine how something should be averaged is to consider how the individuals combine to make the whole. Arithmetic means are used when things add up (e.g. successive increases); geometric means when they multiply (e.g. successive ratios); harmonic means when reciprocals add (e.g. speeds with constant distance).

Next, we have to think about how to predict increases in speed, if it's even possible. I'll quit for now, and maybe come back into the discussion later if I have good ideas. (This is getting too close to real-life business issues, which others may have more understanding of.)
 
I bolded a section above. We are most certainly not talking about one entity. We are talking about different business units in different markets doing different work that cannot be converted to a common unit of work.

As I said, I do not care about the scale of the work completed or the hours worked for any given sample, other than the sample is of sufficient size. I only care about the rate.

As an example, say I hire 100 new workers in the Vancouver market, and they perform 50% better than the new hires the prior year in the same market. And I hire 10 new workers in the Winnipeg market, and they perform 5% better than the new hires the prior year in the same market. Do I weight the average towards the Vancouver market? If the fact is there are 10x the number of new hires because our pay rate is below market value and we churn through employees like underwear, it makes no sense to weight that sample higher! Add in the fact that the work they are doing cannot be added in any meaningful way. I have no reason to believe that 50% is a more realistic expectation of future performance of the new training than 5%.

I will use the median improvement for now.
 
I think you got my point: your situation does not fit my "ifs" for applying the harmonic mean, and is not even something "combined" so that any kind of average makes sense, scaled or not.

What you are really doing is trying to predict the results of a program that shows wild variation. In order to make a valid prediction for a new market, you'd probably need to analyze the cause of the variation; or if that can't be done, perhaps examine the distribution of results in some way. (That would be an area of statistics I claim no knowledge of.)

Here's the best I can come up with: If you suppose (contrary to your expectation, but you can't predict without assuming that something is constant!) that people everywhere respond individually more or less the same, then in your example, in effect you have 100 people whose work output was 150% of the norm, and 10 who produced 105%, so the average person in that group would do (100*1.5 + 10*1.05)/110 = 146%, so you would expect about 46% improvement on the average in another group who behave about the same. That, of course, is a weighted average; and the reason for the weight is not that the first group are more important, but just that more of the data come from there.
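The arithmetic in that paragraph, next to the unweighted alternative, can be sketched in Python as follows. The group sizes and ratios are the ones from the Vancouver/Winnipeg example; the variable names are just for illustration.

```python
# (number of new hires, output relative to the prior year's hires)
groups = [(100, 1.50), (10, 1.05)]

total_hires = sum(n for n, _ in groups)

# Each person counts once: weight every group's ratio by its head count.
weighted = sum(n * r for n, r in groups) / total_hires

# Each market counts once: ignore head count entirely.
unweighted = sum(r for _, r in groups) / len(groups)

print(f"weighted by head count: {weighted:.3f}")
print(f"one vote per market:    {unweighted:.3f}")
```

The weighted figure is about 1.46 (a 46% improvement) and the per-market figure is 1.275. Which one is appropriate depends on whether the unit you expect to behave consistently is the individual worker or the market.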

The only way I see to avoid side-effects of such weighting would be to take into account specific causes. For example, if there are so many hires in the first group because they come and go quickly, then you might want to weight the data based not on the number of people, but on, say, the number of person-days they represent, or something. And if the work people do is different, you might need to somehow measure it on some scale that does make it comparable.

But ultimately, if there isn't enough data (or time) to analyze causes deeply, then I suppose a median is better than nothing. But it wouldn't be a highly reliable prediction.
 
Taking the weighted average doesn't always even give a sensible result. I've seen this situation happen twice in five years working for my company:

First set of data is Market A. Second set of data is Market B. Results are through the first three months of the new hires' careers.

We have four different cohorts, Market A Feb, Market A Mar, Market B Feb, and Market B Mar. Respectively, the cohorts improved 3.3%, 6.7%, 3.1%, and 3.1% over their baselines for their markets. But when you add all the numbers together, as a group the improvement was only 1.2%!

This is the reverse of taking the shortest guy from a basketball team and putting him in a kindergarten class. Both groups increased their average height, but in reality nothing changed.

The analogy in the data below: both groups actually grew taller, but the short one increased in number as well.

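Market A:

| Hire Date | Hours | Work | Work per Hour | Variance to Baseline |
|---|---|---|---|---|
| 2019 (Baseline) | 1,000 | 1,500 | 1.50 | 0.0% |
| Feb 1, 2020 | 400 | 620 | 1.55 | 3.3% |
| Mar 1, 2020 | 300 | 480 | 1.60 | 6.7% |
| Feb + Mar, 2020 | 700 | 1,100 | 1.57 | 4.8% |

Market B:

| Hire Date | Hours | Work | Work per Hour | Variance to Baseline |
|---|---|---|---|---|
| 2019 (Baseline) | 3,000 | 3,200 | 1.07 | 0.0% |
| Feb 1, 2020 | 1,700 | 1,870 | 1.10 | 3.1% |
| Mar 1, 2020 | 1,300 | 1,430 | 1.10 | 3.1% |
| Feb + Mar, 2020 | 3,000 | 3,300 | 1.10 | 3.1% |

Markets A and B combined:

| Hire Date | Hours | Work | Work per Hour | Variance to Baseline |
|---|---|---|---|---|
| 2019 (Baseline) | 4,000 | 4,700 | 1.18 | 0.0% |
| Feb + Mar, 2020 | 3,700 | 4,400 | 1.19 | 1.2% |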
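The effect can be reproduced with a short Python sketch using the hours and work figures from the tables. The helper names (`rate`, `pooled`) are hypothetical and exist only for this illustration.

```python
# (hours, work) for each cohort, copied from the tables
data = {
    "A": {"baseline": (1000, 1500), "Feb": (400, 620), "Mar": (300, 480)},
    "B": {"baseline": (3000, 3200), "Feb": (1700, 1870), "Mar": (1300, 1430)},
}

def rate(hours, work):
    return work / hours

def pooled(pairs):
    """Add up hours and work across several (hours, work) pairs."""
    pairs = list(pairs)
    return sum(h for h, _ in pairs), sum(w for _, w in pairs)

# Improvement of each cohort over its own market's baseline.
cohort_gains = {}
for market, cohorts in data.items():
    base = rate(*cohorts["baseline"])
    for month in ("Feb", "Mar"):
        cohort_gains[(market, month)] = rate(*cohorts[month]) / base - 1

# Improvement when all hours and all work are lumped together.
base_hours, base_work = pooled(c["baseline"] for c in data.values())
new_hours, new_work = pooled(c[m] for c in data.values() for m in ("Feb", "Mar"))
pooled_gain = rate(new_hours, new_work) / rate(base_hours, base_work) - 1

# Share of total hours logged in the lower-rate Market B.
share_b_base = data["B"]["baseline"][0] / base_hours
share_b_new = (data["B"]["Feb"][0] + data["B"]["Mar"][0]) / new_hours

for key, gain in cohort_gains.items():
    print(key, f"{gain:.1%}")
print(f"pooled gain: {pooled_gain:.1%}")
print(f"Market B share of hours: {share_b_base:.0%} -> {share_b_new:.0%}")
```

Every cohort improves by more than 3%, yet the pooled gain is only about 1.2%, because Market B (which has the lower work-per-hour rate) goes from 75% to about 81% of the total hours. The pooled number mixes a real improvement with a shift in the mix of markets.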
 
The Harmonic Mean (for real this time) is a biased weighted average.

Are you sure that sort of thing will provide any more rational a result than any other variation on that same theme?
 