Averaging Two Separate Sets of Data

Shaenimus (new member, joined Jan 16, 2020, 2 messages)

Hello,

I work for a transportation company that uses a GPS device in its vehicles to track its drivers' driving habits. A point is assigned to a driver who brakes too harshly, accelerates too quickly, turns too harshly, speeds, etc. Currently those points are tracked as a rate: Points per Driving Hour. However, some drivers cover longer distances while accruing fewer driving hours, so they feel that the current formula is unfair.

Is there a way to combine each driver's Points per Hour and Points per Mile into a single score?
 
Example:

Distribution Center 1 has 10 drivers who drove a combined 12,000 miles in 350 hours in one month. They received 4 points that month and so averaged about 0.01 Points per Hour.

Distribution Center 2 has 2 drivers who drove a combined 3,600 miles in 50 hours in the same month. They received 1 point and averaged 0.02 Points per Hour.

DC 1 averaged 1,200 miles per driver.

DC 2 averaged 1,800 miles per driver, and they think they should be scored based on miles, not hours driven.
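
For reference, here is a small sketch of how both rates could be computed side by side from the figures above. The class and field names (CenterMonth, points_per_1000_miles, and so on) are made up for illustration and are not part of any existing tracking system.

```python
from dataclasses import dataclass


@dataclass
class CenterMonth:
    """Monthly totals for one distribution center (hypothetical structure)."""
    name: str
    points: int
    hours: float
    miles: float

    @property
    def points_per_hour(self) -> float:
        return self.points / self.hours

    @property
    def points_per_1000_miles(self) -> float:
        # Scaled to 1,000 miles so the number is easier to read.
        return 1000.0 * self.points / self.miles


# Figures taken from the example in this post.
centers = [
    CenterMonth("DC 1", points=4, hours=350, miles=12_000),
    CenterMonth("DC 2", points=1, hours=50, miles=3_600),
]

for c in centers:
    print(
        f"{c.name}: {c.points_per_hour:.4f} points/hour, "
        f"{c.points_per_1000_miles:.3f} points per 1,000 miles"
    )
```

With these numbers DC 2 looks worse per hour (0.020 vs. about 0.011) but slightly better per 1,000 miles (about 0.28 vs. about 0.33), which is the disagreement described above.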

How can they be scored more fairly? Should Points per Hour be averaged with Points per Mile?
 
The correct answer would come from an educated weighting of the two statistics: points per mile and points per hour.
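
As a rough illustration of what such a weighting could look like, here is a minimal sketch. It rests on two assumptions that are not from this thread: each rate is first divided by the fleet-wide rate so that the per-hour and per-mile figures are on a comparable scale, and the weight W_HOUR = 0.5 is a placeholder, not a recommended value. The names combined_score and W_HOUR are hypothetical.

```python
# Monthly totals from the example in this thread: (points, hours, miles).
centers = {
    "DC 1": (4, 350.0, 12_000.0),
    "DC 2": (1, 50.0, 3_600.0),
}

# Hypothetical weight on the per-hour rate; the rest goes to the per-mile rate.
W_HOUR = 0.5

# Fleet-wide totals, used to put both rates on a comparable scale.
fleet_points = sum(p for p, _, _ in centers.values())
fleet_hours = sum(h for _, h, _ in centers.values())
fleet_miles = sum(m for _, _, m in centers.values())


def combined_score(points, hours, miles, w_hour=W_HOUR):
    """Weighted mix of the two rates, each relative to the fleet-wide rate.

    A score of 1.0 means "same as the fleet overall"; lower is better.
    """
    rel_hour = (points / hours) / (fleet_points / fleet_hours)
    rel_mile = (points / miles) / (fleet_points / fleet_miles)
    return w_hour * rel_hour + (1.0 - w_hour) * rel_mile


scores = {name: combined_score(*totals) for name, totals in centers.items()}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

With equal weights this gives roughly 0.98 for DC 1 and 1.23 for DC 2. Shifting the weight toward miles narrows the gap, so the choice of weight is exactly the judgment call that has to come from whoever knows what the score is meant to measure.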

You all are the ones collecting the data and thus should have some idea of what it is you're trying to evaluate.

Clearly miles and hours are correlated, but if you are trying to assess, say, habits at high speeds vs. low speeds, then looking at these averages separately might help.

When you use the word "drove", does that mean the entire time taken for a trip, or just the actual time behind the wheel of a generally moving vehicle? If the former, there will generally be more breaks during the longer trips, and during those breaks there are no driving data at all.

I'd say there's a geographic component to all this as well. Trucks on highways in the Midwest will rarely need any control inputs, whereas trucks snarled in rush-hour traffic in the Northeast will need near-constant control inputs. There is a time-of-day aspect as well.

This all leads to a multidimensional statistics problem, which is probably more complicated than you initially anticipated. It can certainly be done, but if you intend to penalize drivers based on these data, it should be done properly.
 