Probability of type 1 and type 2 errors: Add 50 to each value in this list, and...

StatsBeginner · New member · Joined Nov 10, 2018 · Messages: 2
Hello,

I really need help with my statistics homework.

1. Add 50 to each value in this list, and thus derive a completely new list of values. Calculate the mean and standard deviation of the new list. Now take the value 94 and calculate its z-value with regard to the mean of the first distribution (old list) and then with regard to the mean of the second distribution (new list). To which distribution is it more likely to belong? What is the probability that you are making an error when you say it belongs to one or the other distribution (Type I and Type II errors)?

You can ignore the first part of the question. I have the values but I don't know how to find the probability for type 1 and type 2 errors.

Z-value of 94 using the mean of the first distribution (old list) is 1.91548. Mean = 50.0159

Z-value of 94 using the mean of the second distribution (new list) is -2.00396. Mean = 100.0159

How do I find the probability of type 1 and type 2 errors? Just give me an idea because at the moment I just can't comprehend the concept.

2. Align the two distributions so that the probability of making both the Type I and Type II errors is 1% (alpha = 0.01 and beta = 0.01) by manipulating the number of participants (n). What number of participants do you need to achieve this, and at what value are these two error rates aligned?

I'm completely lost on this one. Please put me on the right track.
 
Hi StatsBeginner,

Let's consider the first case first:

You're looking for evidence that value 94 is unusual somehow. In this case, that means it is a significant enough outlier that you begin to suspect it isn't a random sample drawn from a population having a Normal distribution with mean = 50.0159 and standard deviation = whatever you computed for the std. deviation of your values.

So the null hypothesis in this case would just be that value 94 does belong to this population with this distribution, i.e. there is nothing unusual about it. In other words, its deviation from the mean is perfectly consistent with the deviations observed among values in this population. How do we decide whether this hypothesis is true or false? What's our criterion?

The first thing we need to do is compute the probability of having a value this far away from the mean in the Normal distribution. If we take all the values and subtract their mean, then they should now be distributed around a mean of 0. If we then divide all those values by sigma (the standard deviation that we observed), producing a set of Z-scores, we have now normalized them so that the values are expressed in units of "number of standard deviations from the mean." In other words, the Z-scores of the values should be distributed like a standard Normal distribution: one that has mean 0 and standard deviation 1. The advantage of this is that we know how to compute the probability of being a certain number of standard deviations away from 0 in this standard Normal distribution, i.e. the probability of getting a Z-score at least as high as the one we got. Since this is an introductory stats course, I assume that you've been given a lookup table: tabulated values that show you the probability corresponding to a given Z-score.
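As a quick illustration of that standardization step (the five sample values below are made up, not your homework's list, and I'm assuming the population standard deviation, i.e. dividing by n):

```python
import math

def z_scores(values):
    """Standardize a list: subtract the mean, divide by the (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return [(x - mean) / sigma for x in values]

# Made-up sample values, just to show the effect of standardizing.
scores = z_scores([48, 50, 52, 49, 51])
print(scores)  # distributed around 0, in units of standard deviations
```

If your course uses the sample standard deviation instead (dividing by n - 1), swap that in; the idea is the same.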

In the case of the first distribution (original values) our Z-score for value 94 was +1.91548. In other words, value 94 was 1.91548 standard deviations above the mean of the distribution. How likely is it to get a value this high above the mean, or higher? You can look it up in your Z-score-to-probability table. I get an answer of about 0.027. There is only a 2.7% chance that you'd see a value equal to or greater than value 94, in this population.
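If you'd like to check the table lookup in code, the upper-tail probability of a standard Normal can be computed from the complementary error function (a sketch; `upper_tail` is just an illustrative helper name, not a library function):

```python
import math

def upper_tail(z):
    """P(Z >= z) for a standard Normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

print(round(upper_tail(1.91548), 4))  # about 0.0277, i.e. roughly a 2.7% chance
```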

What are your options?

1) You can reject the null hypothesis: i.e. you can say that value 94's deviation from the mean is statistically significant. It looks like it doesn't belong to this distribution. If you do so, you run the risk of being wrong, of having incorrectly rejected the null hypothesis. This is a Type I error. We also call it a "false positive", because you're claiming to see a positive (non-null) effect, i.e. a significant deviation from the expected mean value, when no such effect is present.

The probability of making this mistake is equal to the probability we just computed: if the null hypothesis were true, then about 2.7 times out of 100 you would expect to see a value at least this large. Thus

\(\displaystyle \mathrm{probability~of~Type~I~error} = \alpha = 0.027 \)

2) You can fail to reject the null hypothesis: i.e. you can say that value 94's deviation from the mean is not necessarily statistically significant. You think it could belong to this distribution. If you do so, you run the risk of being wrong, of having incorrectly failed to reject the null hypothesis. This is a Type II error. We also call it a "false negative", because you're claiming to see a negative (null) result when, in fact, a significant deviation from the expected mean value was present due to some effect alternative to the null hypothesis.

The probability \(\displaystyle \beta\) of getting a false negative depends on what your alternative to the null hypothesis is. In this case, I assume you're supposed to consider the alternative that the value belongs to the other distribution (the new values) with mean 100.0159.
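To make that dependence on the alternative concrete, here is a sketch with purely made-up numbers (null mean 0, alternative mean 3, sigma 1, decision cutoff 2; none of these are your homework's values): alpha is the chance of landing at or above the cutoff when the null is true, and beta is the chance of landing below it when the alternative is true.

```python
import math

def phi(z):
    """Standard Normal CDF, P(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

# Illustrative numbers only, not the homework's values.
mu0, mu1, sigma, cutoff = 0.0, 3.0, 1.0, 2.0

alpha = 1 - phi((cutoff - mu0) / sigma)  # P(value >= cutoff | null is true)
beta = phi((cutoff - mu1) / sigma)       # P(value < cutoff | alternative is true)
print(round(alpha, 3), round(beta, 3))   # 0.023 0.159
```

Notice how beta depends on where the alternative's mean sits relative to the cutoff, which is exactly the point above.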

Can you take it from here?
 

But if our Z-score for value 94 is +1.91548, we should be looking at the positive Z-table, shouldn't we?

[Attached image: positive Z-table]
If I look up the positive Z-table, the answer I get is 0.9726.

Regarding a Type II error, the Z-score for value 94 (new list of values) is -2.00396. So now I would look at the negative Z-table, right? The answer I get is 0.0228. Is the probability of a Type II error 0.023?
 
If I look up the positive Z-table, the answer I get is 0.9726

Yeah, but you need to be aware of what the "positive Z-table" is telling you. It's telling you the probability of being less than or equal to the Z-score you looked up. You want the probability of being greater than or equal to it. Hint: what is one minus the value you looked up?
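In code, the hint amounts to this: the table entry is the cumulative probability, so the upper tail is one minus it (a sketch using the complementary error function).

```python
import math

def phi(z):
    """Standard Normal CDF: the quantity a "positive Z-table" tabulates, P(Z <= z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

table_value = phi(1.92)            # about 0.9726, matching the table lookup
print(round(1 - table_value, 4))   # P(Z >= 1.92): about 0.0274
```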
 
Regarding a Type II error, the Z-score for value 94 (new list of values) is -2.00396. So now I would look at the negative Z-table, right?

Yes

The answer I get is 0.0228. Is the probability of a Type II error 0.023?

No. Again, you need to understand what the Z-table is telling you. It tells you that the probability of being this far below the mean (or farther) is only about 0.023.

So that's something you know. You can use that information to determine what the probability of a Type II error is. But to do that, you need to understand what it means. Look back at the definition I gave for Type II error. Or, much better yet, look at the definition given in your notes or textbook. You'll have to think it through once you understand conceptually what Type II error is.
 