Confidence interval: mean vs proportion

Oli

New member
Joined
Jul 22, 2013
Messages
2
Hope one of you can help:

I've got a general question on finding confidence intervals for the proportion of a population. I don't understand how it is possible to calculate one taking into account only information about the sample. We know nothing about the relationship between the sample and the population. When we do a similar calculation for the distribution of the mean, we know that there is a relationship between the sd of the population and the sd of the means, and that allows us to take information from the sample and use it to inform us about the behaviour of the population.

But that is not true for proportions. The sd of a sample is not related to the sd of the population, as we can see by looking at its formula.

Am I making any sense?

Thanks a lot for your time
regards
Pedro


The drug Viagra became available in the U.S. in May, 1998, in the wake of an advertising campaign that was unprecedented in scope and intensity. A Gallup poll found that by the end of the first week in May, 643 out of a random sample of 1,005 adults were aware that Viagra was an impotency medication (based on "Viagra A Popular Hit," a Gallup poll analysis by Lydia Saad, May 1998).

Let's estimate the proportion p of all adults in the U.S. who by the end of the first week of May 1998 were already aware of Viagra and its purpose by setting up a 95% confidence interval for p.

We first need to calculate the sample proportion \(\displaystyle \hat{p}\). Out of 1,005 sampled adults, 643 knew what Viagra is used for, so \(\displaystyle \hat{p} = \frac{643}{1005} = 0.64\).

Therefore,
A 95% confidence interval for p is \(\displaystyle \hat{p} \pm 2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.64 \pm 2\sqrt{\frac{0.64(1-0.64)}{1005}} = 0.64 \pm 0.03 = (0.61, 0.67)\)
We can be 95% confident that the proportion of all U.S. adults who were already familiar with Viagra by that time was between 0.61 and 0.67 (or 61% and 67%).
The fact that the margin of error equals 0.03 says we can be 95% confident that the unknown population proportion p is within 0.03 (3 percentage points) of the observed sample proportion 0.64 (64%). In other words, we are 95% confident that 64% is "off" by no more than 3 percentage points.
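The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `wald_interval` is made up for illustration, and the multiplier `z=2.0` is the rounded value used in this example (the more precise 95% value is about 1.96).

```python
import math

def wald_interval(successes, n, z=2.0):
    """Normal-approximation interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n                             # sample proportion
    std_error = math.sqrt(p_hat * (1 - p_hat) / n)    # estimated sd of p_hat
    margin = z * std_error                            # margin of error
    return p_hat, margin, (p_hat - margin, p_hat + margin)

# Gallup numbers from the example: 643 "aware" out of 1,005 adults.
p_hat, margin, (low, high) = wald_interval(643, 1005)
print(f"p_hat = {p_hat:.2f}, margin = {margin:.2f}, interval = ({low:.2f}, {high:.2f})")
```

Note that only the sample count and the sample size go in; the standard deviation is estimated from \(\displaystyle \hat{p}\) itself.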
 
The "extra" information you need to relate sample and population comes from checking whether the criteria of a binomial distribution are met:
1. It is a yes/no question. People either do or do not know about Viagra.
2. There is a fixed probability \(\displaystyle p\) that is the same for all trials. Your sample \(\displaystyle \hat{p} = 0.64\) is an estimator of the population \(\displaystyle p\).
3. There is a fixed number of trials, \(\displaystyle N = 1005\).
These criteria are met.

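For large \(\displaystyle N\), with both \(\displaystyle Np\) and \(\displaystyle N(1-p)\) large, the binomial is well approximated by a normal distribution with the same mean and standard deviation (using \(\displaystyle \hat{p}\) in place of the unknown \(\displaystyle p\)):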

\(\displaystyle \mu = Np = 643\)

\(\displaystyle \sigma = \sqrt{Np(1-p)} = 15.2\)

Thus you find the population standard deviation by using your knowledge of the binomial distribution. If the standard deviation of the number of positive responses \(\displaystyle X\) is 15.2, what is the standard deviation of \(\displaystyle \hat{p} = X/N\)? Can you take it from there?
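For anyone who wants to check the numbers, here is a rough sketch that computes the mean and standard deviation of the count directly from the binomial probabilities, compares them with the closed-form \(\displaystyle \sqrt{Np(1-p)}\), and then rescales by \(\displaystyle 1/N\). It assumes \(\displaystyle \hat{p} = 643/1005\) stands in for the unknown \(\displaystyle p\); the helper name `binomial_pmf` is just for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed in log space to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

n = 1005
p = 643 / 1005   # assumption: the sample proportion is used in place of the true p

pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
mean_count = sum(k * w for k, w in enumerate(pmf))
var_count = sum((k - mean_count) ** 2 * w for k, w in enumerate(pmf))

sd_count = math.sqrt(var_count)           # sd of the number of "yes" answers
sd_closed = math.sqrt(n * p * (1 - p))    # closed-form binomial sd
sd_p_hat = sd_count / n                   # dividing the count by n divides the sd by n

print(f"mean = {mean_count:.1f}, sd of count = {sd_count:.1f}, sd of p_hat = {sd_p_hat:.4f}")
```

Dividing a random variable by a constant divides its standard deviation by the same constant, which is all that is needed to go from the count to the proportion.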
 
Thanks a lot for your reply. Sorry for the delay, but I was traveling.

Anyway, I am reading a bit more on the subject. But I guess that the sd of \(\displaystyle \hat{p}\) would be 0.015.
I'm not too sure how important it is to understand exactly how this comes about, as long as one knows how to apply it.

thank you
regards
Pedro
 
What is important is to recognize that as the sample size \(\displaystyle N\) is varied, the uncertainty of the estimate \(\displaystyle \hat{p}\) varies as \(\displaystyle 1/\sqrt{N}\).
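A quick way to see this scaling is to hold the proportion fixed and quadruple the sample size a few times. The sketch below assumes \(\displaystyle p = 0.64\) from the example; the sample sizes are arbitrary illustrative values, not from the poll.

```python
import math

def standard_error(p, n):
    """Standard deviation of the sample proportion for sample size n."""
    return math.sqrt(p * (1 - p) / n)

p = 0.64                           # proportion from the example, held fixed
sizes = [250, 1000, 4000, 16000]   # each size is 4x the previous one (illustrative)

errors = [standard_error(p, n) for n in sizes]
ratios = [errors[i + 1] / errors[i] for i in range(len(errors) - 1)]

for n, se in zip(sizes, errors):
    print(f"N = {n:6d}  standard error = {se:.4f}")
print("ratio between consecutive sizes:", [round(r, 3) for r in ratios])
```

Each fourfold increase in \(\displaystyle N\) cuts the standard error in half, so halving the margin of error costs four times as many respondents.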
 