# Probability of a winner? - prediction competition

#### mricon

##### New member
Hi there

I have something which has been bugging me a bit which I will try to explain in a VERY simple example...

Say I am going to roll a 6-sided die. I ask a number of people to predict the number that will come up. They do this without knowing each other's predictions. So, if there were three people and they chose separate numbers, there would be a 50% chance of one of them getting it right. However, if two of the three happen to pick the same number, only two of the six outcomes are 'covered', making it a 33% chance. If they all picked the same number (obviously less likely), then there would only be a one in six chance (16.67%).

So, the more predictions there are, the more likely it is that there will be a 'winner'; however, the more likely it also is that there will be repeat predictions.

Imagine this example scaled up to say 10,000 outcomes and 2,000 predictions - how likely is it there will be a winner?

It is not 20% (or very, very unlikely to be) because of the repeat predictions. So, what if there were 3,000 predictions, 4,000 predictions etc.

Even at 10,000 predictions what would it be? I expect it would be close to 90-100% but what would it be? If there were 20,000 predictions (or 100,000) how close would you get to 100%? I guess there may be confidence levels here.

Is there an online tool or model where you can put the variables in and it tells you the likelihood?

You'd need to know three variables: (a) the number of predictions, (b) the number of outcomes and (c) the number of repeat predictions. The last one is the tricky one, but I guess it could be calculated using probability in a model?

It is (c) that makes this harder, particularly as the number of entrants and the number of outcomes both affect it.

Hope that makes sense and happy to elaborate!

Thank you.

#### JeffM

##### Elite Member
Let p be the probability of a single person predicting event A correctly.

Therefore 1 - p is the probability of a single person predicting incorrectly.

If there are n people making independent predictions, the probability that all will be wrong is

$$\displaystyle (1 - p)^n.$$

So the probability that at least one prediction will be correct is

$$\displaystyle 1 - (1 - p)^n.$$

Let's see how this works out with three people and one fair 6-sided die, so p = 1/6 and n = 3.

$$\displaystyle 1 - \left (1 - \dfrac{1}{6} \right )^3 = 1 - \dfrac{125}{216} = \dfrac{216 - 125}{216} = \dfrac{91}{216}.$$

You can get the same answer a different way.

The probability that all three predict correctly is

$$\displaystyle \left ( \dfrac{1}{6} \right )^3 = \dfrac{1}{216}.$$

The probability that exactly two predict correctly is

$$\displaystyle 3 \cdot \left ( \dfrac{1}{6} \right )^2 \cdot \dfrac{5}{6} = \dfrac{15}{216}.$$

The probability that exactly one predicts correctly is

$$\displaystyle 3 \cdot \left ( \dfrac{5}{6} \right )^2 \cdot \dfrac{1}{6} = \dfrac{75}{216}.$$

Add those up and you should get 91/216.
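Both routes can be checked with exact fractions. The sketch below is illustrative only: the function name `p_at_least_one` and the variable names are made up for this example, and it assumes every person picks a face uniformly at random and independently of the others. The last part ties the result back to the "covered outcomes" view in the question by averaging the fraction of faces covered over every equally likely combination of picks.

```python
from fractions import Fraction
from itertools import product
from math import comb


def p_at_least_one(p, n):
    """Probability that at least one of n independent predictions is correct."""
    return 1 - (1 - p) ** n


p = Fraction(1, 6)  # chance that a single uniform random pick is right
n = 3               # number of people predicting

# First method: complement of "everyone is wrong".
complement = p_at_least_one(p, n)

# Second method: add up the cases of exactly k correct predictions, k = 1..n.
by_cases = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(1, n + 1))

# "Covered outcomes" view: for each combination of picks, the chance of a
# winner is (distinct faces picked) / 6; average that over all 6^3 combinations.
sides = 6
picks = list(product(range(sides), repeat=n))
coverage = sum(Fraction(len(set(c)), sides) for c in picks) / len(picks)

print(complement, by_cases, coverage)  # prints: 91/216 91/216 91/216
```

All three values agree, so under the uniform-picks assumption the repeat predictions are already accounted for by the formula and do not need to be supplied as a separate input.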

#### mricon

##### New member
Thank you. How would that work with the larger numbers?

By that I mean the principles remain the same, but the number of possible outcomes to do those calculations on goes up massively.

That is partly why I wondered whether anyone knew of a model or tool to do this.

#### JeffM

##### Elite Member
That is why I gave you the first method: it is a single formula however many predictions there are.

You don't really specify what the underlying probabilities are in your example with 2,000 predictions and 10,000 outcomes.
If p = 1/10000 then the answer is

$$\displaystyle 1 - \left ( 1 - \dfrac{1}{10,000} \right )^{2000} \approx 18\%$$
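For the other prediction counts asked about, the same formula can be evaluated in a few lines. This is a minimal sketch, not a dedicated tool: the function names `p_winner` and `p_winner_approx` are hypothetical, and it assumes all outcomes are equally likely and each prediction is an independent uniform random pick, so p = 1/outcomes. The second function uses the standard approximation (1 - 1/N)^n ≈ e^(-n/N), which is handy for quick mental estimates.

```python
import math


def p_winner(outcomes, predictions):
    """Exact value of 1 - (1 - p)^n with p = 1/outcomes (assumed uniform picks)."""
    p = 1 / outcomes
    return 1 - (1 - p) ** predictions


def p_winner_approx(outcomes, predictions):
    """Approximation 1 - exp(-n/N), close to the exact value when N is large."""
    return 1 - math.exp(-predictions / outcomes)


outcomes = 10_000
for n in (2_000, 3_000, 4_000, 10_000, 20_000, 100_000):
    exact = p_winner(outcomes, n)
    approx = p_winner_approx(outcomes, n)
    print(f"{n:>7} predictions: {exact:.4f} (approx {approx:.4f})")
```

Under these assumptions, 10,000 predictions on 10,000 outcomes give a winner only about 63% of the time, and 20,000 predictions give roughly 86%. The probability approaches 100% as predictions are added but never reaches it, because a random pick can always repeat an earlier one.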