In the game Diablo 2, there is a specific game item that appears in a chest with a known probability of 1/21,000. I was in a debate with a guy about finding such items systematically by running a certain dungeon again and again. In this specific dungeon, a chest that can contain the item appears 6 times per "run" if you follow a certain path. The guy I was debating insisted again and again that the odds of the item appearing in any given run are 1/3500 because you have 6 chances at 21,000 to 1 odds. This doesn't seem right to me. Since each 'chest opening' happens independently, and there are an infinite number of chests in the game, can the odds of finding this specific item really be reduced in such a way by opening multiple chests? By this logic, it could be said that opening 21,000 chests would have to reduce the odds to 1/1, and I am pretty sure this isn't right. I tried to research this topic online quite a bit, but I got bogged down reading about various distribution methods and I am pretty far removed from my college days at this point. I would like to form a coherent argument as well as a better understanding of how to solve this kind of problem. Can someone please help me figure this out?
TLDR: What is the real probability of success in 6 independent trials if each trial has 1/21,000 odds?
Your friend is wrong in principle. To see why, let's suppose the probability of success on one trial is 1/3 and you have 3 trials.
\(\displaystyle \text {Probability of:}\)
\(\displaystyle \text {0 successes } = \dfrac{2}{3} * \dfrac{2}{3} * \dfrac{2}{3} = \dfrac{8}{27}.\)
\(\displaystyle \text {1 success } = \left (\dfrac{1}{3} * \dfrac{2}{3} * \dfrac{2}{3} \right) + \left (\dfrac{2}{3} * \dfrac{1}{3} * \dfrac{2}{3} \right) + \left (\dfrac{2}{3} * \dfrac{2}{3} * \dfrac{1}{3} \right) = \dfrac{12}{27}.\)
\(\displaystyle \text {2 successes } = \left (\dfrac{2}{3} * \dfrac{1}{3} * \dfrac{1}{3} \right) + \left (\dfrac{1}{3} * \dfrac{2}{3} * \dfrac{1}{3} \right) + \left (\dfrac{1}{3} * \dfrac{1}{3} * \dfrac{2}{3} \right) = \dfrac{6}{27}.\)
\(\displaystyle \text {3 successes } = \dfrac{1}{3} * \dfrac{1}{3} * \dfrac{1}{3} = \dfrac{1}{27}.\)
In the case of exactly one success, that means you have two failures as well, and the success can occur on the first, second, or third trial. In the case of exactly two successes, that means you have one failure as well, and the failure can come on the first, second, or third trial.
Your friend's method will be way wrong in this case because it gives 3 times 1/3 = 1. The true probability of exactly one success is 12/27, which is less than 1/2. And the probability of at least one success is 19/27, which is just over 2/3, not 1.
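If you want to check the small example without trusting my arithmetic, here is a minimal sketch that lists all 8 possible outcomes of the 3 trials and adds up their probabilities. The function name `success_count_distribution` is just something I made up for illustration, and it uses exact fractions so nothing gets rounded.

```python
from fractions import Fraction
from itertools import product


def success_count_distribution(p, n):
    """Return {k: P(exactly k successes)} by enumerating all 2**n outcomes."""
    dist = {k: Fraction(0) for k in range(n + 1)}
    for outcome in product([True, False], repeat=n):
        # Independent trials: multiply the per-trial probabilities.
        prob = Fraction(1)
        for success in outcome:
            prob *= p if success else 1 - p
        # sum() of booleans counts the successes in this outcome.
        dist[sum(outcome)] += prob
    return dist


dist = success_count_distribution(Fraction(1, 3), 3)
at_least_one = 1 - dist[0]

for k, prob in dist.items():
    print(k, "successes:", prob)
print("at least one success:", at_least_one)
```

The fractions print in lowest terms, so 12/27 shows up as 4/9 and 6/27 as 2/9.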
It can be shown that there is a general formula, already mentioned by romsek, for m successes in n trials if the probability of success is p.
\(\displaystyle \text {Probability of } m \text { successes } = \dfrac{n!}{m! * (n - m)!} * p^m * (1 - p)^{(n-m)}.\)
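As a rough sketch, the formula translates directly into a few lines of Python. The name `binomial_pmf` is my own choice, not anything standard; `math.comb` is the standard-library function for n!/(m!(n-m)!). Checking it against the 3-trial example above is a quick way to convince yourself the formula matches the hand count.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(m, n, p):
    """Probability of exactly m successes in n independent trials,
    each with success probability p."""
    return comb(n, m) * p**m * (1 - p) ** (n - m)


# Check against the hand-computed example: p = 1/3, n = 3.
p_small = Fraction(1, 3)
small_example = [binomial_pmf(m, 3, p_small) for m in range(4)]
print(small_example)
```

Passing `Fraction` values keeps the results exact; passing ordinary floats works too if you only need a decimal approximation.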
Now we apply that formula to your problem.
\(\displaystyle \text {Probability of } 1 \text { success } =\)
\(\displaystyle \dfrac{6!}{1! * (6 - 1)!} * p^1 * (1 - p)^{(6-1)} =\)
\(\displaystyle \dfrac{6!}{5!} * \left ( \dfrac{1}{21000} \right )^1 * \left ( \dfrac{20999}{21000} \right)^5 = \dfrac{6}{21000} * \left ( \dfrac{20999}{21000} \right)^5.\)
Obviously \(\displaystyle \dfrac{20999}{21000} < 1 \implies \left ( \dfrac{20999}{21000} \right)^5 < 1^5 = 1.\)
\(\displaystyle \text {So the probability of } 1 \text { success } < \dfrac{6}{21000}.\)
\(\displaystyle \text {However, } 0.9996 < \left ( \dfrac{20999}{21000} \right)^5 < 1 \implies \left ( \dfrac{20999}{21000} \right)^5 \approx 1 \implies\)
\(\displaystyle \text {Probability of } 1 \text { success } \approx \dfrac{6}{21000} = \dfrac{1}{3500}.\)
So, given the very low probability you and your friend are dealing with, your friend has a good approximation, but it is an overestimate.
After reading this, you may wonder whether that is even the right question. The better question is what is the probability of getting one or more of the items.
As romsek's calculations show
\(\displaystyle 1 - \left ( \dfrac{20999}{21000} \right)^6 \approx 0.00028568.\)
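\(\displaystyle \dfrac{1}{3500} \approx 0.00028571.\)
So again your friend has a good approximation, but it is again a slight overestimate, because adding the six chances together counts a run with two or more finds more than once.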
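His method will give a decent approximation if the number of trials times the probability of success is much less than 1, as it is here. It gives very bad estimates if that condition is not met. With 21,000 chests, for example, his method says 1/1, but the true probability of at least one item is only about 63%.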
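To see how the shortcut drifts away from the true value as the number of chests grows, here is a small sketch comparing the two. The function names and the list of chest counts are just my choices for illustration. I compute \(1 - (1 - p)^n\) through `math.log1p` and `math.expm1`, which is only a way to keep floating-point accuracy when p is tiny; it is the same quantity.

```python
import math


def naive_estimate(n, p):
    """The 'add up the chances' shortcut: n times p."""
    return n * p


def at_least_one(n, p):
    """Probability of one or more successes in n independent trials.

    Equal to 1 - (1 - p)**n, computed via log1p/expm1 for accuracy
    when p is very small.
    """
    return -math.expm1(n * math.log1p(-p))


p = 1 / 21000
for n in (6, 100, 1000, 21000):
    print(n, naive_estimate(n, p), at_least_one(n, p))
```

For 6 chests the two numbers agree to several significant digits, while for 21,000 chests the shortcut claims certainty and the real figure stays well short of it.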