Statistics question

Let X be the number of calls connected to the wrong agent in 5 consecutive calls. Then X~B(5, p)
We are told that P(X≥1)=0.049
That means P(X=0)=1-0.049 = 0.951
So [MATH]q^5=0.951[/MATH] (where q is 1-p)
[MATH]q=\sqrt[5]{0.951} = 0.99000[/MATH] (to 5 d.p.)
[MATH]\therefore p=1-\sqrt[5]{0.951}=0.01000[/MATH] (to 5 d.p.)

Let Y be the number of the 1000 calls in a day that are connected to the wrong agent.
Then Y~B(1000,0.01) (approx)
For a Binomial random variable: Mean=np, Variance=npq
[MATH]\therefore \text{ Mean }\approx 1000 \times 0.01 = 10[/MATH]
Variance [MATH]\approx 1000 \times 0.01 \times 0.99 = 9.9[/MATH]
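If you want to check the arithmetic yourself, here is a minimal Python sketch of the steps above. The variable names are mine, and it rounds p to 0.01 before finding the mean and variance, as the working does.

```python
# Recover p from P(X >= 1) = 0.049 where X ~ B(5, p),
# then find the mean and variance of Y ~ B(1000, p).
p_at_least_one = 0.049
n_calls = 5

# P(X = 0) = q^5 = 1 - 0.049, so q is the fifth root of 0.951.
q = (1 - p_at_least_one) ** (1 / n_calls)
p = 1 - q

n_day = 1000
p_rounded = round(p, 2)  # the working above uses p = 0.01
mean = n_day * p_rounded
variance = n_day * p_rounded * (1 - p_rounded)

print(f"q = {q:.5f}, p = {p:.5f}")
print(f"mean = {mean:.1f}, variance = {variance:.1f}")
```

Using the unrounded p instead changes the mean and variance only slightly, which is why the working says "approx".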
 
thanks - this is helpful
 
Good. Let me know if any of it needs explaining further.
I have a second problem. I have done part (a) but I don't get part (b). Why is the number of degrees of freedom 2? I can't understand how she has worked out the Ei (expected frequencies) if she didn't use the binomial distribution.



Screenshot 2021-05-08 at 17.34.30.png
Screenshot 2021-05-08 at 19.07.59.png
 
The last group is small, so it is combined with the previous one. That gives 4-1 df, but p is also calculated from the data, so it is 4-1-1 = 2.
1620505939291.png
 
She calculated a value for p to replace part (a)'s 0.05. We don't know what it is, but she calculated the expected values the way you did in part (a).
(Btw, to calculate it she just took the observed numbers of fakes from the previous table and found there were [MATH]43\times 0+62\times 1+26 \times 2 + 13 \times 3 +\text{ (approx) }6 \times 4[/MATH] fake coins, which is 177 out of the 150 bags of 20 coins, i.e. out of 3000 coins.
So the calculated probability of a fake is [MATH]p=\tfrac{177}{3000}=0.059[/MATH]
So her first expected frequency is [MATH]150\times (1-0.059)^{20}=44.5[/MATH] (1 d.p.) etc...)
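Here is a rough Python sketch of that calculation, in case it helps to see every expected frequency produced the same way. The observed counts are the ones quoted in this post, with the last group treated as exactly 4 fakes (the "approx" above). The names are mine, and I am assuming the final cell is "3 or more" because the last two groups are combined.

```python
from math import comb

# Observed number of bags for each count of fake coins, as quoted in the post.
# The last group is treated as exactly 4 fakes, which is an approximation.
observed = {0: 43, 1: 62, 2: 26, 3: 13, 4: 6}
coins_per_bag = 20

bags = sum(observed.values())
fakes = sum(k * count for k, count in observed.items())
p_hat = fakes / (bags * coins_per_bag)  # estimate of p from the data


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Expected frequencies for 0, 1 and 2 fakes, then "3 or more" as the remainder.
expected = [bags * binom_pmf(k, coins_per_bag, p_hat) for k in range(3)]
expected.append(bags - sum(expected))

print(f"bags = {bags}, fakes = {fakes}, p_hat = {p_hat:.3f}")
print([round(e, 1) for e in expected])
```

The first entry should agree with the 44.5 above, and the four entries add up to 150 by construction.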
 
I don't get why it's 4-1-1. There are 4 columns, so the 4-1 is easy; the second -1 isn't.
 
Knowing the estimated value of p (calculated from the observed data) and knowing that the total number of data is 150 together take away 2 degrees of freedom. If we have 4 categories for the observed data, usually filling in 3 of them means the fourth is determined, because we know the total is 150. This time, however, filling in 2 of the categories is enough to determine the final 2, because we know not only that the total is 150 but also the calculated value of p. So if I only filled in the first 2 categories and gave you the calculated p (approx) and told you there are 150 data, you would be able to work out the values for the last 2 observed categories. Hence you have only 2 degrees of freedom. (You can only fill in 2 categories before the rest are determined.)
 
I'm never going to get it. I can't see why it's two. There are still 4 columns and 2 rows, and 2 values still leaves a lot unfilled. Sorry. I will have to get a private tutor. No, I don't get it at all. 'We have a calculated value of p': I don't understand any of this.
 
I take it you are doing Further Maths then. Not easy!
I wouldn't get too worried about understanding what degrees of freedom mean. I would focus on being able to calculate them correctly. (That's the level of explanation the examiner requires).
For the Binomial this is:
1620565340286.png

So you need to be able to recognise two things:
(1) How many cells there are
(2) Whether the estimate for p is calculated from the data

(1) The cells refer to the observed frequencies only.
There is only 1 row, and because we put the last two groups together, the number of cells is 4.
1620565643540.png

(2) You need to be able to distinguish between part (a) and part (b) of the question.
In part (a), the 0.05 estimate for p comes simply from a belief/guess, not from a calculation.
In part (b), the estimate of p (actually about 0.059) comes from a calculation that the person has done using the observed data in part (a).

Therefore using * above:
for part (a) - there are [MATH]\nu=4-1[/MATH] degrees of freedom
for part (b) - there are [MATH]\nu=4-2[/MATH] degrees of freedom.
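
If it helps to see the counting as a recipe, here is a tiny Python sketch. It assumes the usual goodness-of-fit rule (number of cells, minus 1, minus the number of parameters estimated from the observed data); the function name is just mine.

```python
def chi_squared_dof(cells, estimated_parameters=0):
    """Degrees of freedom: cells - 1 - (parameters estimated from the data)."""
    dof = cells - 1 - estimated_parameters
    if dof < 1:
        raise ValueError("not enough cells for a goodness-of-fit test")
    return dof


dof_part_a = chi_squared_dof(cells=4)                          # p = 0.05 is given
dof_part_b = chi_squared_dof(cells=4, estimated_parameters=1)  # p is calculated

print(dof_part_a, dof_part_b)
```

The only thing that changes between the two parts is whether p counts as an estimated parameter.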

(People are different, and perhaps you would like an explanation of what degrees of freedom mean. I will attempt this if it is important to you, but the above is really what you need.)
 
I won't worry about this any more as there is so much to revise. I am extremely grateful for the help. :)
 
There is one more. It's on Poisson and the CLT. I know the mean and variance are the same.
So N(2.3, 2.3/100), and I know the Z score, but I am not getting 0.09632 but 0.09362452... I'm not sure where the error is?
Screenshot 2021-05-09 at 15.29.19.png
Screenshot 2021-05-09 at 15.38.53.png
 