Hello,
I would like to ask for help in clarifying something about the t and Z distributions, more specifically, which one is the correct choice in the situation below. To make it easier to discuss, I will start with an example.
I have a sample of 14 people with height data; the sample mean is 150 cm and the sample standard deviation is 20 cm. The question is: what is the probability that someone is over 200 cm?
If I use Z-values, the calculation is simple:
I standardize the value of 200, i.e., I calculate the Z-value for 200: Z(200) = (200 - 150) / 20 = 2.5. Then I find the right-hand probability for Z = 2.5, which is 0.00621. But this takes into account neither the fact that the data are a sample rather than the population, nor the sample size.
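For reference, the Z calculation above can be reproduced with a short Python sketch. It uses only the standard library, and the variable names are my own. Like the hand calculation, it treats the sample mean and SD as if they were the population values.

```python
from statistics import NormalDist

sample_mean = 150.0  # cm, from the example
sample_sd = 20.0     # cm, from the example
x = 200.0            # height of interest, cm

# Standardize x, treating the sample mean and SD as population values.
z = (x - sample_mean) / sample_sd

# Right-hand tail probability of the standard normal distribution.
p_right = 1.0 - NormalDist().cdf(z)

print(f"z = {z:.1f}, P(Z > z) = {p_right:.5f}")
# prints: z = 2.5, P(Z > z) = 0.00621
```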
When I look for a similar calculation of the t-value, I find everywhere only the formula that the one-sample t-test uses (of course, I also find the two-sample formula, but it is completely irrelevant to the present question):
(samplemean - populationmean) / (sampleSD / Sqrt(samplesize))
I see two solutions from here.
(1) I rephrase the question to correspond to a one-sample t-test, i.e., I treat 200 as the population mean and examine the probability that my sample differs from this population.
(150 - 200) / (20 / Sqrt(14)) = -9.354. Then I find the left-hand probability (because what I'm testing is whether my sample mean is lower than the H0 value) for t = -9.354, df = 13, which is 0.00000019.
(2) I try to make the formula similar to the one used for the Z-value: (X - samplemean) / (sampleSD / Sqrt(samplesize)) = (200 - 150) / (20 / Sqrt(14)) = 9.354. Then I look for the right-hand probability, because the question is what the probability is that someone is over 200 cm. The p-value is naturally the same, 0.00000019.
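The arithmetic in (1) and (2) can be checked with the sketch below. The Python standard library has no t-distribution, so the helper `t_right_tail` is my own illustration, not a library function: it computes the tail through the regularized incomplete beta function, integrated numerically with Simpson's rule, and assumes t > 0 and df >= 2. It only reproduces the numbers; it does not settle which formula is appropriate for the original question.

```python
import math

def t_right_tail(t, df, steps=2000):
    """P(T > t) for Student's t with df degrees of freedom (t > 0, df >= 2).

    Uses P(T > t) = 0.5 * I_x(df/2, 1/2) with x = df / (df + t^2),
    where I_x is the regularized incomplete beta function.
    """
    if t <= 0 or df < 2:
        raise ValueError("this sketch only handles t > 0 and df >= 2")
    a, b = df / 2.0, 0.5
    x = df / (df + t * t)

    def integrand(u):
        return u ** (a - 1.0) * (1.0 - u) ** (b - 1.0)

    # Composite Simpson's rule on [0, x]; steps must be even.
    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    incomplete_beta = total * h / 3.0

    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return 0.5 * incomplete_beta / math.exp(log_beta)

n = 14
sample_mean, sample_sd, x_value = 150.0, 20.0, 200.0

# Same statistic as in (2); approach (1) gives the same magnitude with a minus sign.
t_stat = (x_value - sample_mean) / (sample_sd / math.sqrt(n))
p_right = t_right_tail(t_stat, df=n - 1)

print(f"t = {t_stat:.3f}, df = {n - 1}, P(T > t) = {p_right:.2e}")
```

By symmetry of the t-distribution, the left-hand probability for -9.354 in (1) equals the right-hand probability for 9.354 in (2), which is why both approaches report the same p-value.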
I have three questions:
for (1): Is it correct to rephrase the original question to correspond to the one-sample t-test? Are these two questions theoretically the same?
for (2): Is it correct to convert the formula for calculating the t-value to be similar to the formula for calculating the Z-value?
for both: Why does the t-distribution calculation give a lower probability? Shouldn't the tails of the t-distribution be heavier at such a small sample size and, accordingly, assign a higher probability to values at the edge?
Thank you in advance for all the explanations!