Statistics confidence intervals - help me understand given solutions?

Herondaleheir · Apr 2, 2019

So I have solutions to two statistics questions below, but I don't quite understand where some values came from and was hoping someone could clarify. I bolded the steps I didn't understand and also left a comment at the start of each solution.

Q1:
An estimate of the percentage of the defectives in a lot of pins supplied by a vendor is desired to be within 1% of the true proportion at 90% confidence level.

(a) If the actual percentage of the defectives is known to be 4%, what is the minimum sample size needed for the study?
I don't need help with (a), but its solution shows how to get the values needed for (b), so I've included it in case anyone wants to see them.

CI_Low = p - Z_critical*sqrt(p*(1-p)/n)
CI_High = p + Z_critical*sqrt(p*(1-p)/n)
so to be within 1%, the half-width of the interval must be 0.01:
Z_critical*sqrt(p*(1-p)/n) = 0.01
Z_critical is the Z such that P(z<Z) = 0.95, since the span from 0.05 to 0.95
is the central 90 percent
P(z< 1.64) = 0.9495
P(z< 1.65) = 0.9505
so P(z< 1.645) = approx. 0.95
Z_critical = 1.645
1.645*sqrt ( 0.04*(1-0.04)/ n) = 0.01
sqrt( 0.0384 / n) = 0.01/1.645
sqrt(0.0384/n) = 0.006079027
square both sides
0.0384/n = 3.69546E-05
3.69546E-05*n = 0.0384
n = 0.0384 / 3.69546E-05
n = 1039.1136
but n must be an integer, and rounding down would leave the margin slightly above 1%, so round up
n = 1040
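
The algebra above boils down to n = Z_critical^2 * p*(1-p) / margin^2, rounded up. Here is a minimal Python sketch of that formula, assuming the table value Z_critical = 1.645 used in the solution. The function name `min_sample_size` and its parameters are made up for illustration and are not part of the given solution.

```python
import math


def min_sample_size(p, margin, z):
    """Smallest whole n with z * sqrt(p * (1 - p) / n) <= margin."""
    # Solve z * sqrt(p * (1 - p) / n) = margin for n, then round up:
    # n must be a whole number, and rounding down would widen the interval.
    n_exact = z ** 2 * p * (1 - p) / margin ** 2
    return math.ceil(n_exact)


# Part (a): p = 4%, margin = 1%, Z_critical = 1.645 (table lookup, an assumption)
print(min_sample_size(0.04, 0.01, 1.645))  # 1040
```

The result is sensitive to how Z_critical is rounded: with the unrounded value (about 1.64485) the exact n is about 1038.9, which rounds up to 1039 instead of 1040.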

(b) If the actual percentage of the defectives is unknown, what is the minimum sample size needed for the study?
The solution is below, but I don't understand why p = 0.5. What makes that the worst case?

Worst case (largest value) of sqrt(p*(1-p)/n) for a given n is when p = 0.5
CI_Low = p - Z_critical*sqrt(p*(1-p)/n)
CI_High = p + Z_critical*sqrt(p*(1-p)/n)
1.645*sqrt(p*(1-p)/n) = 0.01
substituting p = 0.5:
sqrt(0.5*0.5/n) = 0.01/1.645 = 0.006079027
0.25/n = 3.69546E-05
3.69546E-05*n = 0.25
n = 0.25/3.69546E-05 = 6765.0625
n = 6766

Q2:
A statistician estimates the 92% confidence interval for the mean of a normally distributed population as (162.75, 173.25) at the end of a sampling experiment assuming a known population standard deviation.
a. Use the information given to construct the 97% confidence interval for the population mean.
The solution for this is long, so I'm not going to paste all of it, but I was wondering why the tails '4%' and '1.5%' need to be added in when finding the critical z values. I tried searching online, but I couldn't figure out what formula or rule this falls under.

CI_Low = mean - Z_critical*standard deviation/sqrt(N)
CI_High = mean + Z_critical*standard deviation/sqrt(N)
mean = (162.75 + 173.25)/2 = 168
92 % confidence range has a 4 % tail on each side
so Z_critical is the z such that P(Z<z) = 0.96
Z_critical = 1.750686071
P(z< 1.75 ) = 0.9599
P(z< 1.76) = 0.9608
so for 97 % confidence range
97 % has 1.5 % tails on both sides
P(z< Z) = 0.985 gives Z_critical for that
P(z<2.17) = 0.9850
so the new range will be wider on both sides of the mean
by the ratio of the Z_critical values, since the standard deviation and the sample size did not change
so if you take 1.75 as close enough
we had
CI_Low = 168 - 5.25
CI_High = 168 + 5.25
now we have
CI_Low = 168- (2.17/1.75) *5.25 = 161.49
CI_High = 168 + (2.17/1.75)*5.25 = 174.51
so new range is (161.49, 174.51 )
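
The rescaling step above can be sketched in Python as a check. This is only an illustration under the same assumptions as the solution (known standard deviation, same sample, z-based interval). The function name `rescale_interval` is hypothetical, and it uses unrounded critical values from the standard library's `statistics.NormalDist` instead of the table values 1.75 and 2.17.

```python
from statistics import NormalDist


def rescale_interval(low, high, old_conf, new_conf):
    """Convert a z-based CI for a mean to a different confidence level."""
    center = (low + high) / 2
    old_half_width = (high - low) / 2
    # Two-sided interval: the upper critical value leaves (1 - conf) / 2
    # in the right tail, so the area to its left is 1 - (1 - conf) / 2.
    z_old = NormalDist().inv_cdf(1 - (1 - old_conf) / 2)
    z_new = NormalDist().inv_cdf(1 - (1 - new_conf) / 2)
    # standard deviation / sqrt(N) is unchanged, so the half-width scales with z.
    new_half_width = old_half_width * z_new / z_old
    return center - new_half_width, center + new_half_width


low, high = rescale_interval(162.75, 173.25, 0.92, 0.97)
print(round(low, 2), round(high, 2))  # 161.49 174.51
```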

Dr.Peterson · Apr 2, 2019

There's too much here to look at all at once, so I'll just answer Q1b.

They are saying that sqrt( p*(1-p)/n) is worst (greatest) for a given n when p is 1/2. Here is the graph of y = sqrt(x(1-x)):

[Graph: y = sqrt(x(1-x)) for x from 0 to 1, an arch that rises from 0 at x = 0 to its peak of 1/2 at x = 1/2 and falls back to 0 at x = 1]

Do you see that its maximum is at x=1/2? That's all they're saying. Technically, you could use calculus to prove it; but in the long run it's just a fact you'll keep in mind when you do these problems.
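
One way to see this without calculus is to complete the square: p(1-p) = 1/4 - (p - 1/2)^2, which can never exceed 1/4 and equals 1/4 only at p = 1/2. Here is a small Python sketch that checks the same thing numerically, assuming the 1% margin and Z_critical = 1.645 from the solution in the question. The helper name `required_n` is just for illustration.

```python
import math


def required_n(p, margin=0.01, z=1.645):
    """Sample size needed so that z * sqrt(p * (1 - p) / n) <= margin."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Try p = 0.01, 0.02, ..., 0.99 and find which value demands the largest sample.
candidates = [k / 100 for k in range(1, 100)]
worst_p = max(candidates, key=required_n)

print(worst_p, required_n(worst_p))  # 0.5 6766
print(required_n(0.04))              # 1040
```

Any other value of p needs a smaller sample than p = 0.5 does, so planning for p = 0.5 is safe when the true proportion is unknown.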

Dr.Peterson · Apr 2, 2019

Now let's look at Q2.

I would expect you to have seen this previously, so maybe I'm misinterpreting your uncertainty.

The 92% confidence interval means the middle 92% of the curve, leaving 8% outside it, 4% at each end. You are looking up the critical value at the right end of the range; P(Z<z) is the area under the entire curve to the left of this value, which includes both the 92% and the left-hand 4%, for a total of 96%. This omits only the right-hand tail.

You could also look up the left-hand end, for which P(Z<z) = 4% (the left-hand tail only). This will give the negative of the number you get the way they show, namely -1.750686071.
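
If it helps to see the numbers, here is a short Python sketch of the two lookups using the standard library's `statistics.NormalDist` in place of a printed table. The variable names are my own.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

conf = 0.92
tail = (1 - conf) / 2  # 0.04 in each tail

z_upper = std_normal.inv_cdf(conf + tail)  # area to the left is 0.96
z_lower = std_normal.inv_cdf(tail)         # area to the left is 0.04

print(round(z_upper, 4), round(z_lower, 4))  # 1.7507 -1.7507
# The area between the two critical values is the confidence level.
print(round(std_normal.cdf(z_upper) - std_normal.cdf(z_lower), 4))  # 0.92
```

So the 4% is not added to the z value itself; it is added to the 92% to get the cumulative area (96%) that the table is indexed by.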
