Samples quantity calculation

Neto58 (New member, joined Jan 2, 2022, 4 messages)
Hi, I'm stuck on a real problem; it's not a homework question. Unfortunately, my background in probability is not strong.
We have a universe of 30.000 item types.
And we have samples, each containing 500 unique random items.
How many such samples should we collect so that, all together, they cover 99% of those 30.000 item types? The number is certainly above 60.
This number will be used in my current planning phase, to estimate effort.

Also, how could we refine the calculation if item selection is not uniformly random, but weighted, with weights uniformly distributed between 1 and 2? This will certainly increase the number of samples needed, but by how much?
Thank you in advance.
 
I am guessing that 30.000 follows the European, not the US, convention, i.e. 30.000 is the same as 30*1000. In that case 99% would be 29700 item types, and there is no way to get that from 500 items, unless I am missing something important.
 
If I interpreted the question correctly, the universe has a population of 30,000 items but only 500 unique types, meaning the 30,000 contain duplicates of the 500 types. Are you looking for the number of samples needed to have 99% of the 500 unique types, i.e. 495, or for the number of samples needed to be 99% confident that you have collected all 500 types?
 
Yes, there are 30 thousand item types.
I'm afraid my explanation was not clear; let me try again. What I am looking for is how many samples I will need in order to collect 29700 item types, where each sample has 500 unique items.
 
Let me rewrite my question. The universe has a population of 30 thousand unique item types.
On the other hand, I have groups of 500 unique items, which I call samples.
My question is: how many such groups do I need to collect in order to have at least 29700 different items in my collection?
 
It doesn't sound like a probability/statistics question, more like algebra. Assume sampling without replacement: if the population is 30 thousand, each of your samples contains 500 observations, and you need at least 29,700, then 29,700/500 = 59.4 samples. Since you can't take a fraction of a sample, round up to 60. So you need at least 60 samples to obtain at least 29,700.
 
The problem is that different sets of 500 items may overlap. For instance, two particular sets may contain 990 unique items between them, not 1000, because 10 items show up in both sets. That's why I posted the question in a Probability thread.
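To put a rough number on the overlap effect described here, one possible sketch follows. It assumes (this is not stated in the thread) that every sample is 500 distinct items picked uniformly at random from the 30 thousand, independently of the other samples. Under that assumption a given item is missed by one sample with probability 1 - 500/30000, so after n samples the expected covered fraction is 1 - (1 - 500/30000)^n. The function names are hypothetical, chosen only for illustration.

```python
import math


def expected_coverage(n_items, sample_size, n_samples):
    """Expected fraction of item types seen after n_samples samples.

    Assumption: each sample holds sample_size distinct items chosen
    uniformly at random, and samples are independent of each other.
    """
    miss_one = 1.0 - sample_size / n_items  # one sample misses a given item
    return 1.0 - miss_one ** n_samples


def samples_for_expected_coverage(n_items, sample_size, fraction):
    """Smallest number of samples whose expected coverage reaches fraction."""
    miss_one = 1.0 - sample_size / n_items
    # Solve 1 - miss_one**n >= fraction for n, then round up.
    return math.ceil(math.log(1.0 - fraction) / math.log(miss_one))


if __name__ == "__main__":
    n = samples_for_expected_coverage(30_000, 500, 0.99)
    print(n, expected_coverage(30_000, 500, n))
```

With these inputs the estimate lands at about 275 samples, far above the no-overlap minimum of 60. This is an expected value, not a guarantee: any single run can need somewhat more or fewer samples.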
 
In other words, you're looking for the number of samples you need to draw so that you are guaranteed to have at least 29700 items, where the items within each sample are unique, i.e. no sample contains the same item type twice.
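With random overlap between samples there is no finite number that gives a hard guarantee, so a simulation may be a more practical way to estimate the count, including the weighted case raised in the original question. The sketch below is only one way to model it. It assumes each item type has a fixed weight, that a sample is a weighted draw of distinct items (implemented here with Efraimidis-Spirakis random keys, a technique chosen for this illustration and not mentioned in the thread), and that samples are independent. All function names are hypothetical.

```python
import heapq
import math
import random


def weighted_sample(weights, k, rng):
    """Draw k distinct item indices, favouring items with larger weights.

    Efraimidis-Spirakis keys: item i gets key log(u) / w_i with u uniform
    in (0, 1]; the k largest keys form the sample.
    """
    keyed = ((math.log(1.0 - rng.random()) / w, i) for i, w in enumerate(weights))
    return [i for _, i in heapq.nlargest(k, keyed)]


def samples_needed(weights, sample_size, target_count, rng):
    """Count samples drawn until target_count distinct items have been seen."""
    seen = set()
    draws = 0
    while len(seen) < target_count:
        seen.update(weighted_sample(weights, sample_size, rng))
        draws += 1
    return draws
```

One simulated run of the equal-weight case would be `samples_needed([1.0] * 30_000, 500, 29_700, random.Random(0))`. For the weighted case, the weights could be generated with `[rng.uniform(1.0, 2.0) for _ in range(30_000)]`. Repeating the run with different seeds shows how much the count varies, which is useful if the plan needs a safety margin. In pure Python a full-size run can take several seconds.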
 