Is my solution correct? (We have a large number of items n>=10000 that we want to divide into one of four boxes.)

wiizz

New member
Joined
Nov 1, 2023
Messages
1
Problem: We have a large number of items n>=10000 that we want to divide into one of four boxes. For each item, we independently and uniformly at random assign that item to one of the four boxes. Ideally, after this is done, each box would contain roughly 1/4 of the items, but we are not sure how many items each box contains after randomly assigning them. The claim is that with probability at least 99% all the boxes get between 23% - 27% of the items. Prove or disprove the claim.

Answer: this is how I've thought about solving the problem, using the Chernoff Bound:
We have X is the average of K independent random variables X1,...,Xk.
We can define Xi = 1 if item i is assigned to box 1 for example
if item i is not assigned to box 1, we define Xi = 0
The probability that an item is placed into one of the boxes is 1/4, so E[X]=1/4 (by linearity of expectation). Now the claim can be written as
Pr[E[X] − 0.02 <= X <= E[X] +0.02] >= 0.99
The probability that the claim isn't true is the probability that X < 0.23 OR X >0.27, and according to the Chernoff bound this should be <=2e^(2k*epsilon^2)
where k = 10000 and epsilon=0.02
This value came out to be less than 0.01, which is the required confidence level so the claim must be true then.

I've left out some details but this is the general answer, is there an obvious error/flaw that I am missing or is this solution correct?
 
I asked my computer to compute the probability that [imath]2300 \leq \sum_{k=1}^{10000} X_i \leq 2700[/imath], and it returned [imath]1 - 3.68\cdot 10^6[/imath].
 
Top