Question about combining results of multiple sets of data

LiquidSapphire · Jan 31, 2013

Percentiles of Percentiles - Question about combining results of multiple sets of data

Hello -

I am running a retirement simulation and wanted to run my statistics logic by someone smarter than me to see if this holds water.

I am using the calculator at firecalc.com, which is a retirement calculator. I am trying to find the annual spending level that will be successful 80% of the time (success = you don't run out of money), given a certain set of facts such as mean return, standard deviation of return, etc. The calculator can do this very well. The problem is that it will only run 181 trials at once, and every time I run it the result varies significantly.

I want to run many more trials than 181, say 8000 trials. So I ran the calculator 45 times (45 x 181 = 8145 trials in total), each time recording the spending level that was successful in 80% of that run's 181 trials.

So now I have a set of 45 numbers, each of which was successful 80% of the time in its own run of 181 trials. I want to distill this down to 1 number instead of 45 numbers: the spending level that would have an 80% success rate if one were to run 8000 trials. Do I take the average of them, or do I take the 80th percentile of them, or neither? Is there a way I can calculate the 80% success level for 8000 trials by running 181 trials 45 times and looking at the 80% success level of each of those mini runs?

If it makes a difference, I kept the parameters in the calculator the same every time. I only hit the "submit" button 45 times so I could see the result of the 181 trials 45 different times.

DrPhil · Feb 5, 2013

LiquidSapphire said:
I am trying to find the annual spending level that will be successful 80% of the time (success = you don't run out of money), given a certain set of facts such as mean return, standard deviation of return, etc. The calculator can do this very well. The problem is that it will only run 181 trials at once, and every time I run it the result varies significantly.

I want to run many more trials than 181, say 8000 trials. So I ran the calculator 45 times (45 x 181 = 8145 trials in total), each time recording the spending level that was successful in 80% of that run's 181 trials.

So now I have a set of 45 numbers, each of which was successful 80% of the time in its own run of 181 trials. I want to distill this down to 1 number instead of 45 numbers: the spending level that would have an 80% success rate if one were to run 8000 trials. Do I take the average of them, or do I take the 80th percentile of them, or neither? Is there a way I can calculate the 80% success level for 8000 trials by running 181 trials 45 times and looking at the 80% success level of each of those mini runs?

We can assume that there is a "population distribution" that would be the result of running an infinite number of trials. That distribution has both a mean (average) and a standard deviation, which is a measure of dispersion about the mean. Those two statistics are denoted by the Greek letters \(\displaystyle \mu\) and \(\displaystyle \sigma\) (mu and sigma). When you run an experiment of 181 trials, the result of that experiment, viewed across many repeated experiments, will follow approximately a "normal distribution" (bell curve) with mean = \(\displaystyle \mu\) and standard deviation reduced from the population value by dividing by the square root of the number of trials, \(\displaystyle \sigma/ \sqrt{181}\). [BTW - we don't have to know the shape of the population distribution - for a sample this large, the sample distribution is "always" very close to normal.]
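The \(\displaystyle \sigma/ \sqrt{181}\) scaling can be checked numerically. The following is only a minimal sketch: it assumes a made-up, deliberately skewed population (exponential with \(\displaystyle \sigma = 1\)) rather than anything FIRECalc actually produces, and the function name and the choice of 2000 repeated experiments are illustrative.

```python
import math
import random
import statistics

def sample_means(n_trials=181, n_experiments=2000, seed=0):
    """Mean of each simulated experiment of n_trials draws.

    The population is exponential(1), an assumed stand-in chosen only
    because it is clearly not bell-shaped. Its true sigma is 1.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n_trials))
        for _ in range(n_experiments)
    ]

means = sample_means()
observed = statistics.pstdev(means)   # spread of the experiment means
predicted = 1.0 / math.sqrt(181)      # sigma / sqrt(number of trials)

print(f"observed spread of experiment means: {observed:.4f}")
print(f"predicted sigma / sqrt(181):         {predicted:.4f}")
```

The two printed numbers should agree closely even though the underlying population is far from normal, which is the point of the bracketed remark above.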

From the distribution of your 45 experiments, find the mean and the standard deviation, and use those to get estimators for the population statistics. Let \(\displaystyle x_i\) represent the result of experiment \(\displaystyle i\), namely, the value of spending level that has 80% success. The distribution is then the probability of an 80% success rate as a function of spending level.

Mean: \(\displaystyle \mu = \frac{1}{45} \sum_{i=1}^{45} x_i \)

StdDev: \(\displaystyle \sigma = \sqrt{ 181 \left[ \frac{1}{45} \sum_{i=1}^{45} x_i^2 - \mu^2 \right] } \)

The standard deviation is the square root of the "Variance," and Variance is the mean of the square minus the square of the mean. The factor of \(\displaystyle \sqrt{181}\) in the standard deviation takes you back to the population distribution - i.e., the estimated result of a single model.
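As a rough illustration, the two estimators above can be computed like this. It is a sketch only: the function name `estimate_population` is made up for this example, and the five numbers in `results` are hypothetical placeholders standing in for the 45 spending levels, not real FIRECalc output.

```python
import math

def estimate_population(results, trials_per_run=181):
    """Estimate mu and sigma from per-experiment results.

    Variance of the results is the mean of the squares minus the square
    of the mean. Multiplying by trials_per_run before the square root
    scales the spread back up to the population distribution.
    """
    n = len(results)
    mean = sum(results) / n
    mean_of_squares = sum(x * x for x in results) / n
    # Guard against a tiny negative value caused by floating-point rounding.
    variance_of_runs = max(mean_of_squares - mean ** 2, 0.0)
    sigma = math.sqrt(trials_per_run * variance_of_runs)
    return mean, sigma

# Hypothetical placeholder values; replace with the 45 recorded results.
results = [40000.0, 41000.0, 39000.0, 40500.0, 39500.0]
mu_hat, sigma_hat = estimate_population(results)
print(f"estimated mu:    {mu_hat:.2f}")
print(f"estimated sigma: {sigma_hat:.2f}")
```

With the real data, `results` would hold all 45 values and `n` would be 45, matching the sums in the formulas.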

The short answer is "Yes, take the average of the 45 experiments." However, just knowing the mean of the distribution is not enough to base decisions on - you have to know how broad the distribution is.
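One common way to put a number on "how broad" is the standard error of the average, i.e. the spread of the results divided by the square root of the number of experiments, with a rough 95% range of about 1.96 standard errors either side. This is an assumption added here for illustration, not something stated in the post; the function name and the sample values are hypothetical.

```python
import math
import statistics

def mean_with_interval(results, z=1.96):
    """Average of the results with an approximate 95% range.

    Uses the sample standard deviation (n - 1 divisor) of the results
    and assumes the average is roughly normally distributed.
    """
    mean = statistics.fmean(results)
    spread = statistics.stdev(results)
    std_error = spread / math.sqrt(len(results))
    return mean, mean - z * std_error, mean + z * std_error

# Hypothetical placeholder values; replace with the 45 recorded results.
runs = [40000.0, 41000.0, 39000.0, 40500.0, 39500.0]
avg, low, high = mean_with_interval(runs)
print(f"average: {avg:.2f}  approx. 95% range: {low:.2f} to {high:.2f}")
```

With 45 results instead of 5, the range narrows because the divisor becomes the square root of 45.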

LiquidSapphire · Feb 9, 2013

Thank you so much for your thorough and well-thought-out answer. I knew it was more complicated than simply taking the average!!

I will definitely take this information back to the data and try it out. Thanks again, I really appreciate your answer and the explanation behind it.
