Assessing Normality in Distribution

is_919

New member
Joined
Apr 2, 2011
Messages
8
In Assessing Normality, where did the pattern for the cumulative areas 1/2n 3/2n 5/2n 7/2n & so on come from?

I can see the pattern for the cumulative areas to the left of the corresponding sample values: 1/2n 3/2n 5/2n 7/2n and so on, that is:

numerator = odd numbers only (it seems)
denominator = 2 multiplied by the value of n

My question is: why does the numerator consist of only odd numbers? How did this pattern come about? Why is the 2 constant in the denominator? Does the pattern go on indefinitely?

Please help! I know it is easier to use the calculator but I really want to understand this concept!

Thanks!
 
As part of the procedure in determining whether a set of data has a normal distribution, one has to identify the areas of 1/2n, 3/2n, 5/2n, 7/2n, and so on. With a sample of size n, each value represents a proportion of 1/n of the sample. The pattern (1/2n, 3/2n, 5/2n, 7/2n ....) consists of cumulative areas to the left of the corresponding sample values. My question is how does the pattern come about? Where was it derived from?

1/2n, 3/2n, 5/2n, 7/2n ...

Why are all the numerators odd numbers only? Why not even numbers?
Why is the 2 in the denominator constant? Why is it multiplied by the sample size?
Does the pattern continue on indefinitely?
 
First, it is considered bad form to post in multiple locations. It suggests you value your own time more than that of the volunteers doing their best to help you, and it discourages them: multiple responses may be created where one would have sufficed, wasting the limited resource of volunteer time.

http://answers.yahoo.com/question/index?qid=20110402182914AApkUNP

Odd, I wonder how I missed this construction technique. Meh. We all have little holes in our education.

You have n data points. If you add one point at a time, you construct a series of cumulative chunks of your probability distribution. Adding up how much you have produces 0/n, 1/n, 2/n, 3/n, 4/n, etc., all the way to n/n = 1 -- representing all the data.

Now, there is a bit of a magic part. First, we'll rewrite what we just did. The reason won't be obvious for a moment; bear with me.

0/n, 1/n, 2/n, 3/n, 4/n, ... n/n = 0/(2n), 2/(2n), 4/(2n), 6/(2n), ..., (2n)/(2n)

Convince yourself that these are the same.
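If you'd rather let the computer convince you, here is a quick check using exact rational arithmetic (the sample size n = 5 is an arbitrary choice for illustration):

```python
from fractions import Fraction

n = 5  # any sample size works; 5 is just for illustration

# Original sequence: 0/n, 1/n, 2/n, ..., n/n
original = [Fraction(k, n) for k in range(n + 1)]

# Rewritten sequence: 0/(2n), 2/(2n), 4/(2n), ..., (2n)/(2n)
rewritten = [Fraction(2 * k, 2 * n) for k in range(n + 1)]

print(original == rewritten)  # True -- same values, different notation
```

Multiplying numerator and denominator by 2 changes nothing, which is exactly the point.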

This sequence defines n intervals, namely, [0/(2n), 2/(2n)], [2/(2n), 4/(2n)], [4/(2n), 6/(2n)],...,[(2n-2)/(2n), (2n)/(2n)]

Notice how each interval starts where the previous one left off. We're just chopping the distribution up into tiny pieces, one for each sample item.

The last part is more a convention than an important revelation. We need a name for each interval, and we need a single value to assign to each interval. A natural choice is the midpoint of each interval. Here is the construction.

[0/(2n), 2/(2n)] ==> 1/(2n)
[2/(2n), 4/(2n)] ==> 3/(2n)
[4/(2n), 6/(2n)] ==> 5/(2n)
...
[(2n-2)/(2n), (2n)/(2n)] ==> (2n-1)/(2n)

And there you have it. Do you see why we transformed the denominators to 2n? It was ONLY so we could define the midpoints more conveniently.
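The construction above can be verified directly: the midpoint of the i-th interval [(2i-2)/(2n), (2i)/(2n)] is exactly (2i-1)/(2n). A short Python sketch, again with an arbitrary sample size:

```python
from fractions import Fraction

n = 4  # arbitrary sample size for illustration

for i in range(1, n + 1):
    # i-th interval: [(2i-2)/(2n), (2i)/(2n)]
    left = Fraction(2 * i - 2, 2 * n)
    right = Fraction(2 * i, 2 * n)

    # midpoint of the interval equals (2i-1)/(2n): the odd-numerator pattern
    midpoint = (left + right) / 2
    assert midpoint == Fraction(2 * i - 1, 2 * n)
    print(f"[{left}, {right}] ==> {midpoint}")
```

Since the numerators 2i-1 for i = 1, 2, ..., n run through 1, 3, 5, ..., 2n-1, the numerators are always odd, and the pattern stops at (2n-1)/(2n) rather than continuing forever. In a normal quantile plot, these midpoint areas are the cumulative left-tail probabilities you would feed into the inverse normal CDF to get the expected z-scores for each sample value.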

Note: The notation 1/2n, 3/2n, etc. is very bad. You may recall studying the order of operations in your pre-algebra classes. A proper presentation would include the parentheses, as I have demonstrated above. Technically:
1) Without parentheses: 3/5*2 = (3/5)*2 = 0.6*2 = 1.2
2) With parentheses: 3/(5*2) = 3/10 = 0.3
It makes a difference.
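Python follows the same order of operations, so you can check this directly:

```python
# Division and multiplication have equal precedence and group left to right,
# so 3/5*2 means (3/5)*2, not 3/(5*2).
print(3 / 5 * 2)    # (3/5)*2 = 1.2
print(3 / (5 * 2))  # 3/10 = 0.3
```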
 