Wednesday, September 28, 2016

Z SCORES FOR SAMPLE MEANS

CAVEAT

Hopefully everything from the Descriptive vs. Inferential Statistics module makes sense. If not, consider reviewing it again! We are going to take that information and combine it with what you know about standard deviation to move onward!

Let's start today with a video overview of what we will be discussing:

{VIDEO}

Introduction

In the lesson on Z Scores, we learned that a Z Score is nothing scary--it just means "number of standard deviations from the mean". So, suppose we have a dataset with a mean of 50 and a standard deviation of 15. What is the Z Score of an observation in that dataset with the value 80?

Take a quiz to see if you remember!

{Quiz. A: Z Score=2}

That quiz is a reminder that Z Score is nothing scary. It just means "number of standard deviations from the mean!" Sometimes we have to first figure out how far away something is from the mean, but it is otherwise quite straightforward!
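
If you want to double-check that answer with a quick computation, here is a small sketch in Python using the numbers from the question above:

```python
# Z Score = (observation - mean) / standard deviation
mean = 50
sd = 15
observation = 80

z = (observation - mean) / sd
print(z)  # 2.0 -- the 80 is 2 standard deviations above the mean
```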

"So, how can we combine this with inferential statistics?"

I'm glad you asked! We can do the same thing with sample means that we can with observations in a sample.

Observations in a Sample vs. Sample Means in a Distribution

^^^^ That subtitle is a little intense! Let's break it down into English. (As always, this is about translating from Statistics language to English--or really whatever language you are learning stats in!).

Examples usually help make the point:

EXAMPLE OF OBSERVATIONS IN A SAMPLE

This one should seem familiar. Suppose we have a classroom of 10 students (it is a small classroom). They took a stats test and got the following scores:

98, 92, 90, 95, 88, 94, 95, 100, 99, 98

(They are very good students)

If we want to know the Z Score of the first observation (98), what do we do? What we have been doing all along! 

Let's try again (just for fun!).

Start by adding up the observations and dividing by the number of observations:

Xi
98
92
90
95
88
94
95
100
99
98
949 <-- sum!
94.9 <-- mean (949/10)!

Now, subtract 94.9 (the mean) from each observation to get the difference of each observation from the mean (that is the heart and soul of variance!).

Xi Xi-Xbar
98 3.1
92 -2.9
90 -4.9
95 0.1
88 -6.9
94 -0.9
95 0.1
100 5.1
99 4.1
98 3.1

Now, add up those differences. REMEMBER, THIS IS A CHECKPOINT AND SHOULD COME OUT TO ZERO, MEANING WE HAVE SUCCESSFULLY BALANCED THE SEESAW!

Xi-Xbar
3.1
-2.9
-4.9
0.1
-6.9
-0.9
0.1
5.1
4.1
3.1
-5.6843E-14

Wait! The sum is not zero, but -0.000000000000056843!

All is well! In Stats language, we call this "rounding error". It means that we rounded to tenths (0.1) or hundredths (0.01) so our numbers are not exact. That is fine! As long as that number is tiny, we are still OK.

*The very aspiring and astute student will usually ask at this point, "How tiny is tiny?" As a rule of thumb, if this number is less than 0.01% of the mean, you are (probably) OK! So take your mean, multiply by .0001 and if the difference is less than that, chances are it is just rounding error!*
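
If you like to let the computer do the checking, here is a small sketch in Python of the seesaw checkpoint and the rule of thumb above, using the scores from this example:

```python
scores = [98, 92, 90, 95, 88, 94, 95, 100, 99, 98]
mean = sum(scores) / len(scores)      # 94.9

diffs = [x - mean for x in scores]    # difference of each observation from the mean
total = sum(diffs)                    # should come out to (about) zero

# Rule of thumb from above: anything smaller than 0.01% of the mean is just rounding error
print(abs(total) < mean * 0.0001)     # True -- the seesaw is balanced!
```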

Next, square the differences so that the sum doesn't just come out to zero!

Xi Xi-Xbar diff^2
98 3.1 9.61
92 -2.9 8.41
90 -4.9 24.01
95 0.1 0.01
88 -6.9 47.61
94 -0.9 0.81
95 0.1 0.01
100 5.1 26.01
99 4.1 16.81
98 3.1 9.61
94.9 -5.6843E-14 142.9

So, we have summed squared differences equal to 142.9!

Now, how do we compute variance? Divide by the number of observations (n)!

142.9/10=14.29

14.29 = variance

How do we get standard deviation? Remember that variance is S (standard deviation) squared! So, just take the square root of variance to get S (standard deviation).

sq rt of 14.29=3.78! This is standard deviation! 

Now, to get the Z Score--the number of standard deviations from the mean--we need to know how far the observation of interest (98) is from the mean:

98-94.9=3.1

So, how do we find out how many standard deviations 3.1 is? The same way you find out how many feet 24 inches is--DIVIDE! 

3.1/3.78=0.82

So, 0.82 is the number of standard deviations the 98 is from the mean! We can translate it like this: the Z Score of 98 is 0.82.

So, the Z Score is 0.82! 

EASY! (?)
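
Here is the whole calculation again in one place, sketched in Python (dividing by n, since these are observations in a sample and we are describing, not estimating):

```python
scores = [98, 92, 90, 95, 88, 94, 95, 100, 99, 98]
n = len(scores)

mean = sum(scores) / n                                   # 94.9
variance = sum((x - mean) ** 2 for x in scores) / n      # 142.9 / 10 = 14.29
sd = variance ** 0.5                                     # about 3.78

z = (98 - mean) / sd                                     # 3.1 / 3.78 = about 0.82
print(round(z, 2))                                       # 0.82
```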

EXAMPLE OF SAMPLE MEANS IN A DISTRIBUTION

Now, suppose we want to know the last stats exam score for every college student in America! It just became infeasible to collect all of those scores! It would cost too much (time, money, etc.). This means you will have to take the largest sample you can and hope it comes close to the "true" population average!

OF UTMOST IMPORTANCE--> As the video shows, choosing the largest sample will give you the best chance of getting a sample average that matches the "true" population average. This is due to the "Law of Large Numbers" and the Central Limit Theorem (CLT)! More on these later...

Think back to the video now. Imagine 100,000 test scores (I won't put a column that big in this module, so your imagination will have to do!). Let us suppose that you can afford to do a sample of 100 students. You also have taken a stats class and know that you should take a random sample to give yourself the best chance at that sample average matching the "true" population average. 

Random = all elements in the population have a non-zero chance of being chosen, and all elements have an equal chance of being chosen.

Because we are taking a random sample, you never know which 100 observations will be chosen. This means that there will be some variation in the sample's average compared to other possible samples. In the video, we used a simulator to show that a sample of a given size will be a little different each time if you take repeated samples.

This means that (drum roll...) SAMPLE MEANS HAVE A DISTRIBUTION THEMSELVES!
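
You can see this for yourself with a tiny simulation. This is just a sketch: it invents a made-up "population" of 100,000 scores (an assumption, not the real data from the video) and then takes repeated random samples of 100, just like the simulator did:

```python
import random

random.seed(1)

# A made-up "population" of 100,000 test scores (an assumption, just for illustration)
population = [random.gauss(75, 10) for _ in range(100_000)]

# Take 10 random samples of 100 students each and record each sample's average
sample_means = []
for _ in range(10):
    sample = random.sample(population, 100)
    sample_means.append(sum(sample) / len(sample))

print([round(m, 1) for m in sample_means])  # each sample's average is a little different
```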


So, now picture that we take 10 samples of 100 (10 samples of 100 observations each) and we record the average for each sample of 100. The first has an average score of 78, the second has an average of 72, the third of 70 and so on.

Here are the averages for all 10 samples of 100:

78, 72, 70, 75, 68, 74, 75, 80, 79, 78

THERE IS GOOD NEWS AND BAD NEWS!

The good news is that we can still compute Z Scores in the same GENERAL way we have been doing! The bad news is that we now have to make some adjustments because we are not collecting all the information in the population--we are trying to estimate it.

When we estimate, we must adjust.
When we estimate, we must adjust.
When we estimate, we must adjust. 


Let's first look at what we do exactly the same way as what we have been doing:

Means   Mean - Avg. of the Means   diff^2
78 3.1 9.61
72 -2.9 8.41
70 -4.9 24.01
75 0.1 0.01
68 -6.9 47.61
74 -0.9 0.81
75 0.1 0.01
80 5.1 26.01
79 4.1 16.81
78 3.1 9.61
74.9 -5.68434E-14 142.9

Notice that there are slight modifications to the headings because we now have means in column 1. We still take the average of that column just like always, and the wording gets a little awkward because we now have a mean of the means...


To slip out of that awkward wording, we call it THE GRAND MEAN!

The Grand Mean!
Here it is again, with the new title:

Means   Mean - Grand Mean   diff^2
78 3.1 9.61
72 -2.9 8.41
70 -4.9 24.01
75 0.1 0.01
68 -6.9 47.61
74 -0.9 0.81
75 0.1 0.01
80 5.1 26.01
79 4.1 16.81
78 3.1 9.61
74.9 -5.68434E-14 142.9


Notice that everything up to this point is just as you are used to.

However, when we divide the sum of squared differences (142.9), we must make some of those adjustments!

When we estimate, we must adjust!

Because we took samples, and did not get information from the whole population, there is going to be some error in our variance and standard deviation if we just divide 142.9 by the number of means (10). This is where "n-1" comes into play! We have 10 means, but we can subtract 1 as an adjustment that will make the variance and standard deviation closer to the "true" population variance and standard deviation.

There will be a quiz in a minute, so be sure to take note:

FOR SD OF OBSERVATIONS IN A SAMPLE, DIVIDE BY N
FOR SD OF MEANS IN A DISTRIBUTION, DIVIDE BY N-1!

DESCRIPTIVE SD = DIVIDE BY N
INFERENTIAL SD = DIVIDE BY N-1!

So 142.9/10 gives variance =14.29
BUT 142.9/9=15.88

***Some of you will be asking, "How do we know that dividing by n-1 gives us a better estimate of the "true" population standard deviation than dividing by just n?" Here is proof:

Option 1
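
If magic isn't enough for you, here is a rough sketch of the usual argument (stated without the full derivation): when you measure distances from the sample mean instead of the true population mean, the squared differences come out a little too small on average, and dividing by n-1 exactly corrects for that. In symbols,

$$E\left[\sum_{i=1}^{n}(X_i-\bar{X})^2\right]=(n-1)\sigma^2
\quad\text{so}\quad
E\left[\frac{1}{n-1}\sum_{i=1}^{n}(X_i-\bar{X})^2\right]=\sigma^2$$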




Option 2, think of this as more STATS MAGIC! 


If you have all elements of the population, divide by n; if not, divide by n-1!

Most students prefer Option 2...***

Now that we have 142.9/9=15.88, we know the variance.

We can take the square root and have standard deviation! sq rt of 15.88=3.98!
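
Here is the same arithmetic sketched in Python (note the n - 1 in the variance, because we are estimating):

```python
means = [78, 72, 70, 75, 68, 74, 75, 80, 79, 78]
n = len(means)

grand_mean = sum(means) / n                                     # 74.9 -- the Grand Mean!
variance = sum((m - grand_mean) ** 2 for m in means) / (n - 1)  # 142.9 / 9 = about 15.88
sd = variance ** 0.5                                            # about 3.98

print(round(variance, 2), round(sd, 2))                         # 15.88 3.98
```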

Now you try...

{quiz}

Hopefully that quiz went well. If not, don't worry, you will get more practice!

Means have a distribution

Some of you will have caught on by now, but sample means have a distribution just like observations in a sample! Remember the bell curve?

{Image: the standard normal distribution with Z Scores}

It still works! It is mostly a theoretical distribution, but if you took all possible samples of a given size, the sample averages would make this same shape! Most would be close to the "true" population mean, right in the center. Few would be much larger or smaller than the "true" population mean. Everything that holds true for observations within a sample also holds true for sample means in the theoretical distribution of all possible samples. The standard deviation is computed (almost) the same way (just remember n-1 instead of n!), and the standard deviations still correspond to the same respective areas under the curve.

Now is a good time to memorize some of those main areas:

Area between -1 and 1=68.2%
Area between -2 and 2=95.4%
Area between -3 and 3=99.7%
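
If you want to check those areas yourself, here is a quick sketch in Python (using the error function from the math module; the exact values round to about 68.3%, 95.4%, and 99.7%):

```python
from math import erf, sqrt

for z in (1, 2, 3):
    area = erf(z / sqrt(2))          # area between -z and z under the standard normal curve
    print(z, round(area * 100, 1))   # 1 68.3, 2 95.4, 3 99.7
```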

Take a second to commit them to memory, then try a quiz!

{quiz}

How did that go? You will definitely be seeing more of those in the future so be sure to memorize them if you didn't get them correct on the quiz!


Population parameters:

The theoretical distribution of sample means has a standard deviation, but because it is a theoretical distribution, we almost never know what it "truly" is. We estimate it like this: standard deviation of your sample divided by the square root of the number in your sample. 

In stats language: s/(√n)

{quiz. How do you write: standard deviation of your sample divided by the square root of the number in your sample in Stats language?}. 

Great! This standard deviation also has a special name: "standard error".

Standard error = standard deviation of the theoretical distribution of all possible sample means ≈ standard deviation of your sample divided by the square root of the number in the sample = s/(√n)

≈ (means that it is roughly equivalent. In this case, it is an estimate!)
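
Here is a quick sketch in Python. The sample standard deviation of 10 is just an assumed number to show the formula; the 100 is the sample size from the example above:

```python
s = 10                             # standard deviation of your sample (assumed for illustration)
n = 100                            # number of observations in your sample

standard_error = s / (n ** 0.5)    # s / sqrt(n)
print(standard_error)              # 1.0 -- estimated SD of the distribution of sample means
```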

Conclusions

  • When we have a really big population, and it is not feasible to get information from every single person, we can take a sample and use it to estimate the true population mean!
  • We can take a sample of a given size and compute the mean of that sample in the same way we have been computing the mean all along!
  • The larger the sample, the more likely it will be to be a close estimate of the "true" population mean.
  • We can compute Z Scores for sample means the same way we computed Z Scores for observations in a sample--the only difference is that we divide the sum of squared differences by n-1 instead of just n. That helps make it a more accurate estimate of the "true" population standard deviation.
  • There is a theoretical distribution of the means of all possible random samples of a given size. It has the standard normal shape, with standard deviations that correspond to the same respective areas under the curve as the standard normal distribution you have learned about already!
  • The theoretical distribution of the means of all possible random samples of a given size has a standard deviation that is almost never known! You can estimate it by computing the standard deviation of your sample and dividing by the square root of your sample size! This standard deviation has a special name: standard error!
