probability distribution

irum · Jul 5, 2013

Suppose there are 1 million parts, of which 1% are defective, i.e. the 1 million parts contain 10000 defective parts. Now suppose we take samples of different sizes from the 1 million, such as 10%, 30%, 50%, 70% and 90% of the 1 million parts, and we need to calculate the probability of finding at most 5000 defective parts in each sample. Since 1% of the 1 million parts are defective, the value of success p is 0.01 and failure q is 0.99. The issue is that when we calculate the probability for sample sizes below 50% of the 1 million parts, the probability of finding ≤ 5000 defective parts is always 0; at 50% of the 1 million parts it is 0.5; and sample sizes of more than 50% give a probability equal to 1. So we only get three probability values across all sample sizes, i.e. 0, 0.5 and 1. There are no intermediate values between 0 and 0.5 or between 0.5 and 1, although the sample size is changing linearly. Can someone please point out the issue in this problem? I will be very grateful.

HallsofIvy · Jul 5, 2013

irum said:
Suppose there are 1 million parts which have 1% defective parts i.e 1 million parts have 10000 defective parts. Now suppose we are taking different sample sizes from 1 million like 10%, 30%, 50%, 70%, 90% of 1 million parts and we need to calculate the probability of finding maximum 5000 defective parts from these sample sizes. As 1 million parts has 1% defective parts so value of success p is 0.01 and failure q is 0.99. Now the issue is when we r calculating probability of sample sizes below 50% of 1 million parts, value of probability for finding ≤ 5000 defective parts is always 0, at 50% of 1 million parts it is 0.5 and sample sizes of more than 50% give probability equal to 1.

You've lost me here. What do you mean by "probability of samples sizes"? The probability of what?
I assume that by "sample sizes below 50%" you mean "samples of 500000 parts or less". If there are 1% defective parts, then there would be an average of .01(500000)= 5000 defective parts in a sample of 500000 parts, still a probability, of course, of 1%.

It means we only get three probability values in all sample sizes i.e 0, 0.5, 1. Now the issue is that there are no intermediate values between 0-0.5 or 0.5-1 although sample size is changing linearly. Can someone plz mention the issue in this problem. I will be very grateful

I have no idea where you got that "0", "0.5", "1".

irum · Jul 5, 2013

HallsofIvy said:
You've lost me here. What do you mean by "probability of samples sizes"? The probability of what?
I assume that by "sample sizes below 50%" you mean "samples of 500000 parts or less". If there are 1% defective parts, then there would be an average of .01(500000)= 5000 defective parts in a sample of 500000 parts, still a probability, of course, of 1%.

You didn't get my problem correctly. The total was 1 million parts, of which 1% are defective, so there were 10000 defective parts in the 1 million. Now I want to take samples of different sizes from these 1 million parts, i.e. I am taking 10%, 30%, 50%, 70% and 90% of these 1 million parts. For each sample size I want to calculate the probability of finding at most 5000 defective parts. Suppose I take a sample size of 10% of the 1 million parts, which is 100000 parts. The probability of finding 5000 defective parts in this sample comes out as 0, found using the binomial distribution, where the success probability of finding a defective part is p=0.01 (as 1% of the 1 million parts are defective), failure q=0.99, r=5000 and n=100000. Similarly, when I take any sample size that is less than 50% of the 1 million parts, the probability is always zero. When I take a sample size of 50% of the 1 million parts, i.e. 500000, the probability of finding 5000 defective parts is 0.5, and when I take any sample size above 50% of the 1 million parts, the probability of finding 5000 defective parts is 1. My issue is why I am not getting any value between 0 and 0.5 or between 0.5 and 1, although the sample sizes are changing linearly. I think I have made my point clear now, so can you please answer my question?

HallsofIvy · Jul 5, 2013

It's hard to answer because you seem to be asserting things that simply are not true. You say, for example, "when I take any sample size above 50% of 1 million parts, probability of finding 5000 defective parts is 1."

According to you there are 10000 defective parts out of all 1000000. If you take, say 500001, there are 499999 that are NOT in the sample. It is quite possible that all 10000 defective parts are in those 499999 and so, in that case, there are NO defective parts in your sample. I have no idea how you arrive at the idea that it is certain that there will be 5000 defective parts in those 500001.

What is true is that the probability that any one part is defective is .01, the probability that any one part is not defective is .99. The probability that there are a specific number of defectives in a sample of, say, N, is the binomial distribution. With N as large as 500000, that can be well approximated by the Normal distribution with mean .01N and standard deviation $\displaystyle \sqrt{N(.01)(.99)}$.

DrPhil · Jul 5, 2013

irum said:
Suppose there are 1 million parts which have 1% defective parts i.e 1 million parts have 10000 defective parts. Now suppose we are taking different sample sizes from 1 million like 10%, 30%, 50%, 70%, 90% of 1 million parts and we need to calculate the probability of finding maximum 5000 defective parts from these sample sizes. As 1 million parts has 1% defective parts so value of success p is 0.01 and failure q is 0.99. Now the issue is when we r calculating probability of sample sizes below 50% of 1 million parts, value of probability for finding ≤ 5000 defective parts is always 0,

WRONG - the sample is random, so there is a distribution of how many bad ones you get. The average for samples of size N is 0.01*N. Even if you sample 10% (i.e., N = 100,000, mean = 1,000) there is a very small possibility that all 10,000 defectives end up in the sample.

at 50% of 1 million parts it is 0.5 and sample sizes of more than 50% give probability equal to 1. It means we only get three probability values in all sample sizes i.e 0, 0.5, 1. Now the issue is that there are no intermediate values between 0-0.5 or 0.5-1 although sample size is changing linearly. Can someone plz mention the issue in this problem. I will be very grateful

I think the issue here is to determine the parameters of the distribution of sample means. I would also like for you to clarify what sample sizes you are taking .. practical samples are a small percentage of the total.

Your statement about 0 probability of having more than 5,000 defects is true for N<5000, which would be a sample of 0.5%. I think you should consider 1%, 2%, 5%, 10%.

Do you know how to use the "sampling theorem" to find the distribution of sample sizes?

irum · Jul 5, 2013

DrPhil said:
WRONG - the sample is random, so there is a distribution of how many bad ones you get. The average for samples of size N is 0.01*N. Even if you sample 10% (i.e., N = 100,000, mean = 1,000) there is a very small possibility that all 10,000 defectives end up in the sample.
I think the issue here is to determine the parameters of the distribution of sample means. I would also like for you to clarify what sample sizes you are taking .. practical samples are a small percentage of the total.

Your statement about 0 probability of having more than 5,000 defects is true for N<5000, which would be a sample of 0.5%. I think you should consider 1%, 2%, 5%, 10%.

Do you know how to use the "sampling theorem" to find the distribution of sample sizes?

Thanks for your reply but honestly I am beginner in statistics. I don't have any idea about sampling theorem but let me clarify my sample sizes. I m taking different fractions of 1 million parts. Different sample sizes are

10% of 1 million (100000)
30% of 1 million (300000)
50% of 1 million (500000)
70% of 1 million (700000)
90% of 1 million (900000)

In each sample size I need to calculate probability of finding < 5000 defective parts. I am using success value same for all sample sizes i.e p=0.01 as 1 million parts has 1% defective parts, while failure is q=0.99. For each sample size I m only getting probability either equal to 0, 0.5 or 1. but no other values are found so can u help me why there are no intermediate values of probability between 0-0.5 or 0.5-1 although sample sizes are changing linearly

DrPhil · Jul 5, 2013

irum said:
Thanks for your reply but honestly I am beginner in statistics. I don't have any idea about sampling theorem but let me clarify my sample sizes. I m taking different fractions of 1 million parts. Different sample sizes are

10% of 1 million (100000)

30% of 1 million (300000)

50% of 1 million (500000)

70% of 1 million (700000)

90% of 1 million (900000)

In each sample size I need to calculate probability of finding < 5000 defective parts. I am using success value same for all sample sizes i.e p=0.01 as 1 million parts has 1% defective parts, while failure is q=0.99. For each sample size I m only getting probability either equal to 0, 0.5 or 1. but no other values are found so can u help me why there are no intermediate values of probability between 0-0.5 or 0.5-1 although sample sizes are changing linearly

This is different enough from usual questions that I want to rephrase it. Divide 1000000 items into two groups: "In Sample," and "Not In Sample." 1% (10000) of the total number of items are defective. What is the probability that the "In Sample" group has less than half of the defectives?

At 50%, the two groups are equal and the probabilities of which group has less than half are equal: P(In | 50%) = 1/2.

10% In and 90% In are complementary, since 90% In = 10% Not In. That is,
P(In | 10%) = P(Not In | 90%)
P(In | 90%) = 1 - P(In | 10%)

Likewise, P(In | 70%) = 1 - P(In | 30%)

What is left to calculate are the probabilities that a sample of 10% has less than half of the defectives, and the probability that a sample of 30% has less than half of the defectives. The problem is that strictly speaking, the probability p is not constant when the sample is a sizable fraction of the total population. You can get away with 10%, but 30% is questionable. [You definitely can not assume p is constant for the 50%, 70%, or 90% cases, but we have taken care of those.] BUT if that is the only tool you have, then you have to use it. Do you know how to make a normal distribution that approximates the binomial? If p=0.01,
mean $\displaystyle \mu = N\ p = 0.01\ N$
standard deviation $\displaystyle \sigma = \sqrt{N\ p (1-p)} = \sqrt{0.0099\ N}$
$\displaystyle z = (5000 - \mu)/\sigma$
and look up P(<z) in a table of the normal distribution.

EDIT: ok - I can see why you say probability is "0" .. it is very small except passing through 50%.

EDIT AGAIN: changes really fast going through 50%. I estimate that a change of 7000 out of 500000 for N changes z by 1.
Perhaps that is the real answer to your original question:

Now the issue is that there are no intermediate values between 0-0.5 or 0.5-1 although sample size is changing linearly. Can someone plz mention the issue in this problem. I will be very grateful

The intermediate values change too fast. Between ~ 493000 and 507000, z changes from +1 to -1.
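
To see how quickly the values move, the z-score recipe above can be sketched in a few lines of Python. This is only an illustration under the thread's binomial assumption (constant p = 0.01); the function names `normal_cdf` and `prob_at_most` are made up here, and the sample sizes 493000 and 507000 are the ones mentioned in the post.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with erfc so far tails do not lose precision."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def prob_at_most(cap, n, p):
    """Normal approximation to P(X <= cap) for X ~ Binomial(n, p).

    Assumes a constant p, which the thread notes is questionable
    once the sample is a large fraction of the population.
    """
    mu = n * p                            # mean = N p
    sigma = math.sqrt(n * p * (1.0 - p))  # standard deviation = sqrt(N p (1 - p))
    z = (cap - mu) / sigma
    return z, normal_cdf(z)

if __name__ == "__main__":
    for n in (100_000, 300_000, 493_000, 500_000, 507_000, 700_000, 900_000):
        z, prob = prob_at_most(5000, n, 0.01)
        print(f"N={n:>7}  z={z:9.2f}  P(X<=5000)={prob:.6f}")
```

With this definition the probability of at most 5000 defectives comes out essentially 1 for the 10% and 30% samples and essentially 0 for 70% and 90%. The complementary probability, more than 5000 defectives, runs the other way, which would match the 0 / 0.5 / 1 pattern reported at the start of the thread. Intermediate values only show up in a narrow band of sample sizes around 500000.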

irum · Jul 5, 2013

DrPhil said:
This is different enough from usual questions that I want to rephrase it. Divide 1000000 items into two groups: "In Sample," and "Not In Sample." 1% (10000) of the total number of items are defective. What is the probability that the "In Sample" group has less than half of the defectives?

At 50%, the two groups are equal and the probabilities of which group has less than half are equal: P(In | 50%) = 1/2.

10% In and 90% In are complementary, since 90% In = 10% Not In. That is,
P(In | 10%) = P(Not In | 90%)
P(In | 90%) = 1 - P(In | 10%)

Likewise, P(In | 70%) = 1 - P(In | 30%)

What is left to calculate are the probabilities that a sample of 10% has less than half of the defectives, and the probability that a sample of 30% has less than half of the defectives. The problem is that strictly speaking, the probability p is not constant when the sample is a sizable fraction of the total population. You can get away with 10%, but 30% is questionable. [You definitely can not assume p is constant for the 50%, 70%, or 90% cases, but we have taken care of those.] BUT if that is the only tool you have, then you have to use it. Do you know how to make a normal distribution that approximates the binomial? If p=0.01,
mean $\displaystyle \mu = N\ p = 0.01\ N$
standard deviation $\displaystyle \sigma = \sqrt{N\ p (1-p)} = \sqrt{0.0099\ N}$
$\displaystyle z = (5000 - \mu)/\sigma$
and look up P(<z) in a table of the normal distribution.

EDIT: ok - I can see why you say probability is "0" .. it is very small except passing through 50%.

EDIT AGAIN: changes really fast going through 50%. I estimate that a change of 7000 out of 500000 for N changes z by 1.
Perhaps that is the real answer to your original question: The intermediate values change too fast. Between ~ 493000 and 507000, z changes from +1 to -1.

Thank you so much for your reply but can u please tell me that keeping p value constant is correct or not? I mean I am keeping p=0.01 because 1 million has 1% defective parts. If its not correct, what should be value of p in each sample. Value of r is constant in each sample size i.e I always want probability of finding 5000 defective parts in each sample. so plz guide me in this regard.

irum · Jul 6, 2013

One more thing: so far I was taking samples from 1 million parts, of which 1% are defective. Another case is that 100,000 more parts are added to the 1 million parts, and from this total of 1,100,000 parts I need to calculate the probability of finding at most 5000 defective parts. Now what will be the values of p and q? How will I calculate the probability of defective parts from the 1,100,000 parts, where the added 100,000 parts have no defective parts? Please also guide me for this situation.

DrPhil · Jul 8, 2013

irum said:
One more thing: so far I was taking samples from 1 million parts, of which 1% are defective. Another case is that 100,000 more parts are added to the 1 million parts, and from this total of 1,100,000 parts I need to calculate the probability of finding at most 5000 defective parts. Now what will be the values of p and q? How will I calculate the probability of defective parts from the 1,100,000 parts, where the added 100,000 parts have no defective parts? Please also guide me for this situation.

One of the criteria for using a binomial distribution is that there is a "population" distribution, such that every trial has the same probability (p) independent of any other trial. That is not the case in this question: there are a known number of defectives, in a known size of sample. That is more like choosing cards from a deck, without replacement. If for instance you have already selected half of the deck, the probability remaining depends on what has already been selected. If you select the entire deck, there is no uncertainty at all: you know precisely how many defectives there are.

As long as the sample size N is small compared to the total number, you can take the ratio (defectives)/(total) to be a very good estimator of p, and you can accurately approximate the distribution of sample means to have mean = p*N and standard deviation sqrt[N*p*(1-p)]. That is, the binomial distribution is good. But as N gets bigger, the upper limit on the number of defects causes the standard deviation to be smaller than predicted by the binomial distribution. In fact, if N = 1000000, then you know exactly the number of defectives, and the standard deviation is zero.

I consider the question to be poorly framed, such that you can't do it "right." If they just asked what the expected mean should be in each case, you can use p*N to get that. But to find the probability of 5000 of the defectives being in one part or the other, you need to know the standard deviation. Go ahead and use the binomial approximation for 10% and 30%, state that the 50% case is precisely on the mean so you don't have to know the width of the distribution, and use the 10% and 30% results to get 90% and 70%.

If you add another 100000 to the total, with no more defectives, then p = (defectives)/(total) --> 1/110
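
The shrinking spread described above has a standard closed form. As a sketch, assuming the sample of N parts is drawn without replacement from M = 1000000 parts containing D = 10000 defectives (M and D are labels introduced here, not the thread's notation), the number of defectives in the sample follows a hypergeometric distribution, and with p = D/M:

$\displaystyle \mu = N\ p, \qquad \sigma = \sqrt{N\ p\ (1-p)\ \dfrac{M - N}{M - 1}}$

The extra factor (M - N)/(M - 1) is close to 1 when N is small compared to M, which recovers the binomial formula, and it is 0 when N = M, matching the zero standard deviation for the full population. At N = 500000 it is about 1/2, so σ is about 49.7 rather than the binomial value of about 70.4.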

irum · Jul 8, 2013

DrPhil said:
One of the criteria for using a binomial distribution is that there is a "population" distribution, such that every trial has the same probability (p) independent of any other trial. That is not the case in this question: there are a known number of defectives, in a known size of sample. That is more like choosing cards from a deck, without replacement. If for instance you have already selected half of the deck, the probability remaining depends on what has already been selected. If you select the entire deck, there is no uncertainty at all: you know precisely how many defectives there are.

As long as the sample size N is small compared to the total number, you can take the ratio (defectives)/(total) to be a very good estimator of p, and you can accurately approximate the distribution of sample means to have mean = p*N and standard deviation sqrt[N*p*(1-p)]. That is, the binomial distribution is good. But as N gets bigger, the upper limit on the number of defects causes the standard deviation to be smaller than predicted by the binomial distribution. In fact, if N = 1000000, then you know exactly the number of defectives, and the standard deviation is zero.

I consider the question to be poorly framed, such that you can't do it "right." If they just asked what the expected mean should be in each case, you can use p*N to get that. But to find the probability of 5000 of the defectives being in one part or the other, you need to know the standard deviation. Go ahead and use the binomial approximation for 10% and 30%, state that the 50% case is precisely on the mean so you don't have to know the width of the distribution, and use the 10% and 30% results to get 90% and 70%.
If you add another 100000 to the total, with no more defectives, then p = (defectives)/(total) --> 1/110

Sorry Dr Phil, I couldn't understand your last line. As I understand it, you are saying that since the 1 million parts have 10000 defective parts, if I add 10% more to these 1 million parts, where the added 10% has no defective parts, then the value of p will be 10000/11000000. Am I right?

DrPhil · Jul 9, 2013

irum said:
sorry Dr Phil I couldnt understand your last line. According to my understanding you r are saying that as 1 million part has 10000 defective parts, n if i add more 10% to this 1 million parts where added 10% has no defective part, then value of p will be 10000/11000000....too many zeros! am i right?

10000/1100000.
By increasing the total by 10% with no increase in the number of defects, p is divided by 1.1, i.e. it decreases by about 9% (from 0.01 to about 0.00909).

irum · Jul 10, 2013

DrPhil said:
10000/1100000.
By increasing the total by 10% with no increase in the number of defects, p is divided by 1.1, i.e. it decreases by about 9% (from 0.01 to about 0.00909).

thanks for your reply so if I am adding more 20%, 30%, 40%, 50% till 90% parts to 1 million parts with no increase in number of defects, value of p will be 10000/1200000, 10000/1300000, 10000/1400000, 10000/1500000 to 10000/1900000 accordingly. Now the issue is when I am calculating probability of finding 5000 defective parts in each sample, value of probability is same for all 10%-90% samples. Where I am wrong?

DrPhil · Jul 10, 2013

irum said:
thanks for your reply so if I am adding more 20%, 30%, 40%, 50% till 90% parts to 1 million parts with no increase in number of defects, value of p will be 10000/1200000, 10000/1300000, 10000/1400000, 10000/1500000 to 10000/1900000 accordingly. Now the issue is when I am calculating probability of finding 5000 defective parts in each sample, value of probability is same for all 10%-90% samples. Where I am wrong?

We still haven't seen how YOU are estimating the probabilities.

If p changes, the standard deviation changes.
If the standard deviation changes, the z-score changes.
But if the magnitude of z is greater than 5 or 10, the probability is either 0 or 1 for all practical purposes.

The only formula available for standard deviation is to assume a binomial distribution. Even though that formula is wrong when the sample size is a considerable fraction of the total population, it must be what you are "expected" to use. Am I wrong about that? do you have another method? Show us your work! What happens if you calculate for 49% and 51%?

irum · Jul 10, 2013

DrPhil said:
We still haven't seen how YOU are estimating the probabilities.

If p changes, the standard deviation changes.
If the standard deviation changes, the z-score changes.
But if the magnitude of z is greater than 5 or 10, the probability is either 0 or 1 for all practical purposes.

The only formula available for standard deviation is to assume a binomial distribution. Even though that formula is wrong when the sample size is a considerable fraction of the total population, it must be what you are "expected" to use. Am I wrong about that? do you have another method? Show us your work! What happens if you calculate for 49% and 51%?

Well, my issue is that when I calculate the mean np and the standard deviation sqrt[N*p*(1-p)], their values are the same for all samples. E.g. for 20% added parts with no defects, the value of p is 10000/1200000. Without approximation, p=0.0083333333 and 1-p=0.99166666. Here the value of np=10000 and the standard deviation is 99.58. Similarly, if I take the sample adding 90% to 1 million, the value of p is 10000/1900000 = 0.0052631579 and 1-p=0.9947368, so the value of the mean is again 10000 and the standard deviation is 99.73. I don't know why the values come out the same even though the sample is changing from 10% to 90%?

JeffM · Jul 10, 2013

irum said:
HallsofIvy said:

You've lost me here. What do you mean by "probability of samples sizes"? The probability of what?
I assume that by "sample sizes below 50%" you mean "samples of 500000 parts or less". If there are 1% defective parts, then there would be an average of .01(500000)= 5000 defective parts in a sample of 500000 parts, still a probability, of course, of 1%.

U didnt get my problem correctly. Actually total parts were 1 million which have 1% defective parts. So defective parts were 10000 in 1 million parts. Now I want to take different sample size from these 1 million parts i.e I m taking 10%, 30%, 50%,70% and 90% of these 1 million parts. Here I want to calculate probability of finding maximum 5000 defective parts from each sample size. Suppose I m taking sample size 10% of 1 million parts which is 100000 parts. Now to calculate probability of finding 5000 defective parts in this sample size is 0 which is found by using binomial distribution where success of finding defective parts p=0.01 (as we have 1% of defective parts in 1 million parts), failure q=0.99, r=5000 and n=100000. Similarly when I take any sample size whose fraction is less than 50% of 1 million parts, probability is always zero. When I take sample size 50% of 1 million parts i.e 500000, probability of finding 5000 defective parts is 0.5 and when I take any sample size above 50% of 1 million parts, probability of finding 5000 defective parts is 1. My issue is that why i m not getting any value between 0-0.5 or 0.5-1 although sample sizes are changing linearly. I think now I make my point clear so can you please answer my question now

I am going back to the beginning of this thread.

Let u = number in population.

Let d = number of defectives in population, where 0 < d < u.

Let s = number in random sample, where 0 < s < u - d.

Let c = cap on number of defectives in sample, where -1 < c < (s + 1) and c < (d + 1)

Let n(k) = probability of exactly k defectives in sample, where - 1 < k < (c + 1).

Let p(c) = probability of no more than c defectives in sample.

$\displaystyle n(k) = \binom{d}{k} \binom{u - d}{s - k} \div \binom{u}{s} \implies$

$\displaystyle n(k) = \dfrac{d!}{k!\ (d - k)!} \cdot \dfrac{(u - d)!}{(s - k)!\ (u + k - d - s)!} \cdot \dfrac{s!\ (u - s)!}{u!} > 0.$

$\displaystyle p(c) = \sum_{i=0}^{c} n(i) > 0.$

Your statements that the probabilities are either 0 or 1 depending on sample size are wrong.

Now for numbers as large as yours, you either need to do some programming to calculate the probabilities or find some approximations that make sense. Probably integrals will give you a decent approximation, but I have forgotten too much of my calculus to give it a try.
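
As a rough illustration of the "some programming" route, here is a minimal Python sketch that sums the exact without-replacement (hypergeometric) probabilities, using the symbols u, d, s, c and k defined above. It works with logarithms of factorials via `math.lgamma` so that numbers like 1000000! never have to be formed. The function names `log_comb` and `hypergeom_cdf` are invented for this example, and the result is only as accurate as floating-point log-gamma allows.

```python
import math

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k), via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def hypergeom_cdf(c, u, d, s):
    """P(at most c defectives) in a sample of s drawn without replacement
    from a population of u items, d of which are defective."""
    lo = max(0, s - (u - d))  # fewest defectives the sample can possibly hold
    hi = min(c, d, s)         # largest count included in the sum
    if hi < lo:
        return 0.0
    log_den = log_comb(u, s)
    logs = [log_comb(d, k) + log_comb(u - d, s - k) - log_den
            for k in range(lo, hi + 1)]
    m = max(logs)             # factor out the largest term before exponentiating
    total = sum(math.exp(x - m) for x in logs)
    return min(1.0, math.exp(m) * total)

if __name__ == "__main__":
    for s in (100_000, 300_000, 490_000, 500_000, 510_000, 700_000, 900_000):
        print(s, hypergeom_cdf(5000, 1_000_000, 10_000, s))
```

At s = 500000 the sample and the remainder are interchangeable, so the probability of at most 5000 defectives is slightly above 0.5 (the outcome of exactly 5000 counts toward it). Values near 490000 and 510000 are where intermediate probabilities can be seen.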

DrPhil · Jul 10, 2013

irum said:
Well, my issue is that when I calculate the mean np and the standard deviation sqrt[N*p*(1-p)], their values are the same for all samples. E.g. for 20% added parts with no defects, the value of p is 10000/1200000. Without approximation, p=0.0083333333 and 1-p=0.99166666. Here the value of np=10000 and the standard deviation is 99.58. Similarly, if I take the sample adding 90% to 1 million, the value of p is 10000/1900000 = 0.0052631579 and 1-p=0.9947368, so the value of the mean is again 10000 and the standard deviation is 99.73. I don't know why the values come out the same even though the sample is changing from 10% to 90%?

The number of defectives does not change. If you look at 1/10 of the world, the expected number of defectives is 1/10 of all the defectives in the world - which will not change. The standard deviation is different, and thus the z-scores are different, and thus the probability that the question asked for is "different." But that doesn't matter very much. Like the difference between 10^(-1000) and 10^(-10000). Practically, those are both "0", even though one of them is 10^9000 times as big as the other.

Have you compared sample sizes of 49% and 51%, as I suggested? Those should NOT give 0 and 1 for the probabilities.

This problem is not worth the effort we are all putting into it. Just go ahead and plug into the erroneous formula, get what they want, and be done with it.
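
The arithmetic behind the "same mean every time" observation can be checked directly. The sketch below assumes, as the quoted figures np = 10000 and standard deviation 99.58 imply, that N was set to the whole enlarged total; in that case N*p = total * (10000/total) = 10000 regardless of how many defect-free parts are added. For contrast it also uses a fixed sample of 500000, which is a hypothetical choice made here and not a number taken from the question. The function name `binomial_stats` is likewise invented.

```python
import math

DEFECTIVES = 10_000      # defectives in the original 1 million parts
BASE_TOTAL = 1_000_000   # original population size

def binomial_stats(sample_size, total):
    """Binomial-model p, mean and standard deviation with p = defectives / total."""
    p = DEFECTIVES / total
    mean = sample_size * p
    sd = math.sqrt(sample_size * p * (1.0 - p))
    return p, mean, sd

if __name__ == "__main__":
    fixed_sample = 500_000  # hypothetical fixed sample size, for contrast only
    for added_percent in range(10, 100, 10):
        total = BASE_TOTAL + BASE_TOTAL * added_percent // 100
        _, mean_all, sd_all = binomial_stats(total, total)
        _, mean_fix, sd_fix = binomial_stats(fixed_sample, total)
        print(f"+{added_percent}%  total={total}  "
              f"N=total: mean={mean_all:.1f} sd={sd_all:.2f}  "
              f"N={fixed_sample}: mean={mean_fix:.1f} sd={sd_fix:.2f}")
```

With N equal to the total, z = (5000 - 10000)/sd is roughly -50 in every case, so the probability of at most 5000 defectives is 0 for all practical purposes, which is why nothing appears to change. With a fixed sample size the mean does move as parts are added.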

irum · Jul 10, 2013

DrPhil said:
The number of defectives does not change. If you look at 1/10 of the world, the expected number of defectives is 1/10 of all the defectives in the world - which will not change. The standard deviation is different, and thus the z-scores are different, and thus the probability that the question asked for is "different." But that doesn't matter very much. Like the difference between 10^(-1000) and 10^(-10000). Practically, those are both "0", even though one of them is 10^9000 times as big as the other.

Have you compared sample sizes of 49% and 51%, as I suggested? Those should NOT give 0 and 1 for the probabilities.

This problem is not worth the effort we are all putting into it. Just go ahead and plug into the erroneous formula, get what they want, and be done with it.

Sorry Dr Phil I am IT person and not good in statistics so I m not understanding your point. And this problem is imp for me as my all research work depends on it. I have calculated mean and standard deviation for 49% and 51% but still their values are approximately same. I don't know where I am wrong? I mean why all samples are giving same mean and standard deviation values even I am changing samples from 10%-90%. Plz tell me what should I do? Its imp for my thesis.

JeffM · Jul 11, 2013

irum said:
Sorry Dr Phil I am IT person and not good in statistics so I m not understanding your point. And this problem is imp for me as my all research work depends on it. I have calculated mean and standard deviation for 49% and 51% but still their values are approximately same. I don't know where I am wrong? I mean why all samples are giving same mean and standard deviation values even I am changing samples from 10%-90%. Plz tell me what should I do? Its imp for my thesis.

You have not shown any calculations. How do we know what you are doing and whether it is correct or not?

Earlier you asked for a probability. Now you are talking about means and standard deviations. A mean is not a probability. A standard deviation is not a probability. At this point, no one can be sure what you are asking.

If this is important for your thesis, maybe you should take the time to ask a question that is well formulated and comprehensible and to show the work that you are asking us to help with.

Originally you said that you wanted to find the probability that a sample would contain at most 5000 defectives. Is that the problem?

Are you sampling with replacement or without replacement?

Here is an easy problem. There is an urn containing 20 balls, 15 red, and 5 blue. You choose three balls at random without replacement. What is the probability that 2 are blue? How did you calculate that?
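
For reference, one way to work the urn exercise is by counting: choose which blue balls and which red balls are drawn, and divide by the number of ways to draw any three. The sketch below assumes "2 are blue" means exactly 2 blue and 1 red; the function name `urn_probability` is made up for this example.

```python
from math import comb

def urn_probability(blue, red, draws, blue_wanted):
    """P(exactly blue_wanted blue balls) when drawing without replacement."""
    ways_blue = comb(blue, blue_wanted)        # ways to pick the blue balls
    ways_red = comb(red, draws - blue_wanted)  # ways to pick the remaining red balls
    ways_all = comb(blue + red, draws)         # ways to pick any balls at all
    return ways_blue * ways_red / ways_all

# 5 blue, 15 red, draw 3, want exactly 2 blue: (10 * 15) / 1140 = 5/38
print(urn_probability(5, 15, 3, 2))
```

The same counting pattern, with 10000 defectives in place of the blue balls, is the without-replacement calculation for the parts problem.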

irum · Jul 11, 2013

JeffM said:
You have not shown any calculations. How do we know what you are doing and whether it is correct or not?

Earlier you asked for a probability. Now you are talking about means and standard deviations. A mean is not a probability. A standard deviation is not a probability. At this point, no one can be sure what you are asking.

If this is important for your thesis, maybe you should take the time to ask a question that is well formulated and comprehensible and to show the work that you are asking us to help with.

Originally you said that you wanted to find the probability that a sample would contain at most 5000 defectives. Is that the problem?

Are you sampling with replacement or without replacement?

Here is an easy problem. There is an urn containing 20 balls, 15 red, and 5 blue. You choose three balls at random without replacement. What is the probability that 2 are blue? How did you calculate that?

ok let me formulate problem for you. Suppose we have 1 million parts which have 1% defective parts. so there are 10000 defective parts in 1 million parts. Now If i add 10% more parts to these 1 million parts with no defects added, total I have 1100000 parts and value of success for finding defects parts, p is 10000/1100000 from these 1100000 parts. I want to calculate probability of finding at most 5000 defective parts from these 1100000 parts. I am calculating mean and standard deviation because I m using normal approximation, Besides you can see Dr Phil earlier replies to understand the need of mean and standard deviation calculation. My issue is if I am calculating mean and standard deviation for 10% added parts (total )1100000 parts, 20%(1200000), 30%(1300000),40%(1400000) till 90% added parts (1900000 parts) value of mean and standard deviation is same. I cant understand where the problem is?
