Friday, October 14, 2016

Hypothesis testing--"Statistics on trial"

Preamble

You may remember the original Karate Kid movie where a wise and aged Mr. Miyagi takes a young and unseasoned Daniel-san under his wing to teach him karate!

Mr. Miyagi puts Daniel-san to work sanding the floor, painting the fence, painting his house and waxing his cars. When Daniel-san starts to complain that he has learned nothing about karate, that he is only Mr. Miyagi's slave, Miyagi unleashes a barrage of karate attacks against which Daniel-san is able to defend himself using the motions from the chores he has been doing around Miyagi's house.

It is done in a way that can only exist in Hollywood. Still, it is hoped that in this learning module (if you have been diligent in the previous modules) you will have an experience something like the "sand the floor, paint the fence" scene from Karate Kid.

It might even be a good idea to take a 3 minute break and watch the clip from your favorite online video source! Come back when you feel inspired, or continue if you already feel inspired.

What is hypothesis testing?

Over the next few weeks, we will learn about hypothesis testing. This is intended to be a very short introduction to just the basics. More detail will come later. Just focus on these key terms and ideas and then let it set up over the Fall Break!

To begin, Daniel-san learned to sand the floor, paint the fence, wax on wax off, etc. What are the equivalent tools that you have in your arsenal?

  • General knowledge about Z Scores
  • Finding Z Scores for a sample mean in a theoretical distribution (just like Z Scores for observations in a sample except we divide by STANDARD ERROR instead of standard deviation!)
  • Finding confidence intervals (finding two Z Scores that contain a certain percent of the distribution)
You are ready to go!

The main idea of hypothesis testing is that you can use these tools with your data to make a point, but we have to play a little game rooted in philosophy. (Yes, many statistical concepts originate in philosophy). 

Here's an example: 

A local organization claims that children are spending 3 hours per day on screentime. This seems a little high to you, so you want to test it. 

So, here is the game we play (first in English, then in Stats Language):

ENGLISH:

Ok...you think children spend 3 hours per day on screentime? I challenge this assertion! I challenge the status quo. 

So, you say screentime is 3 hours per day. 

I don't think it is. 

I am willing to risk a 5% chance that I am wrong but I want a 95% chance that I am right. 

I took a sample of 300 children and the average is only 2.3 hours. The standard deviation (not standard error!) of the sample is 1.8 hours. Let's compute a Z Score for my sample average to see how likely it is to get a sample mean of 2.3 hours. If fewer than 5% of sample means are this far away from the proposed 3 hour number, I am more than 95% confident that I am right and I am willing to reject the 3 hour number!

STOP AND THINK ABOUT ALL OF THAT UNTIL YOU FEEL CONFIDENT ABOUT IT IN ENGLISH. THEN MOVE ON TO THE STATS LANGUAGE INTERPRETATION:

STATS LANGUAGE:

H0: μ = 3 hours
H1: μ ≠ 3 hours

α =.05

Xbar = 2.3 hours
s = 1.8 hours

That's it! And this is hypothesis testing!

Hopefully you just had a "Miyagi moment", but if not, stop and think about it for a moment. 

We are simply using what we know about distributions and Z Scores to put some assertion on trial!

H0 means "this is the thing on trial". It is usually the status quo because it is something we want to debunk! NOTE! NOTE! NOTE! It contains a statement of equality! 

H0 is called the "Null hypothesis" and null means "nothing". It is the statement of "no difference" and no difference means equal! 

It is the statement of what is! (Just remember you are putting this statement on trial and you have to know what something is in order to put it on trial). 



Just remember this cartoon. You have to know what something is in order to put it on trial! H0 is on trial and must contain an "equal" statement. Usually, it says the population mean is (equals) some number, OR is less than or equal to (or greater than or equal to) some number.

H1 is the claim of the prosecution, if you will--the case against H0! You have to find enough evidence in favor of H1 in order to convict (reject) H0.

α is our tolerance for making a mistake. You know how in court, one is innocent until proven guilty? This is to avoid calling an innocent person guilty! Mistakes still happen, but we can focus on avoiding the mistake we most want to avoid.

There are two kinds of mistakes:

  • Calling a guilty person innocent 
  • Calling an innocent person guilty

In court, we want to avoid calling an innocent person guilty, so we avoid it in the way trials are set up: You only give a guilty verdict if you are sure beyond a reasonable doubt.

In statistics we have two kinds of mistakes:
  1. Rejecting H0 when it really is correct
  2. Failing to reject H0 when it is not correct
In statistics, we want to avoid the first one. It is called Type I error. It goes with our alpha level (α)! 
So, when we said α=.05, it actually means, we will only reject H0 if we are satisfied that there is only a 5% chance or less that we are rejecting the true statement. We want to be 95% confident that we have not made a mistake! 

So, in stats, we want to avoid the Type I error. 

This means that sometimes we fail to reject H0 when we should have. That is why our wording must always be like this (repeat after me):

We reject H0
OR 
We fail to reject H0.

There is no other conclusion in hypothesis testing. 
There is no other conclusion in hypothesis testing. 
And, finally, there is no other conclusion in hypothesis testing. 

In court, there is only "guilty" or "not guilty". 
In hypothesis testing, there is only "reject H0" or "fail to reject H0".

There is no "innocent" verdict in court and there is no "accept H0" verdict in hypothesis testing. 

Example

Because examples are an effective way to learn statistics, we will end this shorter week with an example. Next week, we will resume and get into greater detail. 

This example shows how to do a hypothesis test. We will use the numbers from the screentime example. 


H0: μ = 3 hours
H1: μ ≠ 3 hours

α =.05

Xbar = 2.3 hours
s = 1.8 hours

Standard Error = s/√n = 1.8/√300
1.8/17.32 = .10

Test: (Xbar-μ)/S.E.

(2.3-3.0)/.10 = -.7/.10 = -7.0

What is -7.0? Think of it as a Z Score (it is technically a "t" score, but more on that next week). 

If you use the Math Is Fun Z Score tool, you will see that -7.0 is literally off the charts! It is out there somewhere, the tail of the distribution has just thinned out so much, there is no need to include it in most charts. In fact, only 0.13% of sample means are expected to fall below a Z Score of -3.0. So, even with a Z of -3.0 we would have been 99.87% confident that we would not see a sample mean that low with a true mean of 3.0 hours. 

With a Z of -7.0, we are greater than 99.999999% confident that we would not see a sample mean as low as 2.3 hours if the true mean is really 3.0 hours. Look how thin the tail is at that point. It almost looks like 0 to the naked eye, and it is so small (smaller than 0.000001) that it basically is 0.



So, what do we conclude? 

We reject the null hypothesis (H0) that μ = 3 hours, in support of the alternative (H1) that the real mean is not equal to 3 hours!

We call our risk of making a Type I error, alpha.

We call the probability of getting a result at least this extreme, if H0 really were true, our p value. (p for probability.)

So, with a Z Score of -7.0 we have a p value smaller than 0.000001. That is far less than our alpha (tolerance) level of .05.
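If you like checking this kind of arithmetic with a computer, here is a minimal sketch of the same calculation in Python (this assumes the SciPy library is available; the variable names are just made up for illustration):

from math import sqrt
from scipy.stats import norm

xbar, mu0, s, n = 2.3, 3.0, 1.8, 300   # sample mean, H0 value, sample sd, sample size
se = s / sqrt(n)                       # standard error, about 0.10
z = (xbar - mu0) / se                  # test statistic, about -7.0
p = 2 * norm.cdf(-abs(z))              # two-sided p value, since H1 says "not equal"
print(se, z, p)                        # p is far below alpha = .05, so we reject H0

The factor of 2 is there because H1 is two-sided: the sample mean could have landed far above or far below 3 hours.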

How do we remember this? With a small poem:

"When p is low, reject the null!
When p is low, reject H0!"
 

Now you try a quiz. After that, go enjoy your Fall Break!



{quiz}


Friday, October 7, 2016

ESTIMATION

INTRODUCTION

This topic usually gets a little tricky for students, so don't get worried if it doesn't all "click" right away. An effective way for many students is to think of this unit in parts:

  1. I know how to compute Z Scores in general (if not, review the unit on computing Z Scores for an observation in a sample).
  2. I used Z Scores to find the area under the standard normal curve that was associated with a certain score. 
  3. Now, I can use Z Scores in a similar way to find the area under the curve that is associated with a certain sample mean.
  4. A major key is: If I don't know the "true" population mean, I can take a sample, compute the mean and use it to estimate the "true" population mean!
Let's do #4 again! It is just that important:

If I don't know the "true" population mean, I can take a sample, compute the mean and use it to estimate the "true" population mean!
THIS IS ONE OF THE GREATEST THINGS IN STATISTICS!

It truly is! Think of it-- If you want to know the height of all 320 million Americans, you can take a sample of 300 people and be able to estimate the average from that sample of 300!

You might be thinking:

"Dr. Oliver, this is too good to be true! There must be a catch!"

There is...But not much of one. The catch is that our sample needs to be large enough. How large is "large enough", you ask? It depends on several things, but we will focus on SAMPLE SIZE! 

For the interested and highly-aspiring student (like you are!), homogeneity of the population is also a big deal! If your population of 320 million is EXACTLY the same, you only need a sample of 1 person to know everything about the population! If they are all completely and utterly different, you need a sample of 320 million to really know about the population.*

As you can see, the level of homogeneity ("sameness") and sample size go hand in hand! 

We will not go into depth in this class on how to use homogeneity to determine a proper sample size (though it is a fascinating topic!). The point that you should appreciate is that we have 320 million unique individuals in the United States, so unless we sample them all, we will probably not get exactly the right information! 

TRUE!

This is where estimation comes in! When we take a sample of 300 and find the mean of that sample, it will almost never be exactly equal to the population mean. 

Think of it. Suppose the average number of hours spent by Americans playing video games per week is 2.5. In fact, we are rounding and the number is probably really something like: 2.5946713496838495698182933409485920948589209194890301038902... you get the idea...


  • What are the chances that my sample of 300 people will also have a mean of 2.5946713496838495698182933409485920948589209194890301038902...? Zero. Not gonna happen...
  • What are the chances that my sample  of 300 people will have a mean of 2.5? Pretty high.
  • What are the chances that my sample  of 300 people will have a mean of 2.4321 or 2.69784? Also pretty high. In fact, more likely than not! 
MOST SAMPLES WILL BE SOMEWHERE NEAR THE "TRUE" POPULATION MEAN!

NO SAMPLE WILL BE EXACTLY THE POPULATION MEAN. 

This is why some statisticians have been heard saying something that is probably incomprehensible to the rest of the world: 

"No one is average." 
Let's take a quick quiz to see if you know why statisticians say this. 

{quiz: Based on what we have just learned why do statisticians say: "No one is average?" A. Because, while most people will probably fall near the mean, no one will be exactly the mean. B. Because of the law of large numbers. C. Because of the individualistic culture in America. D. Statisticians are wrong. Most people are average. }

Nobody is Average- a normal distribution curve with figures inside it and DNA as the curve
A graphic from a CDC.gov blog talking about how no one is average. https://blogs.cdc.gov/genomics/2014/07/02/nobody-is-average/

We will leave the argument right there about whether or not anyone actually is average, because the only point you need to take away is that, if we took a random sample of 300 people and did so 1,000 times, none would be the same, and none would match the population average. 

For the student that likes hands-on learning, go back to my simulation and try it! Look at the average for the sample of 300 and draw new samples over and over until you are convinced that none will be the same...

{File Insert: Sample Simulator}

This unit is divided into three parts:

I. Review of Z Score calculations
II. Review of differences if you want to estimate a population parameter
III. "How good is the sample?" An introduction to confidence! 

REVIEW OF Z SCORE CALCULATIONS

Let's start off right away with a quiz!

{Quiz: Z to raw, raw to Z revisited! 

Suppose Darrell is a pretty fast runner, but only feels better about himself if he is faster than other people [for shame, Darrell :( Try running for a PR instead!]. Darrell can run a mile in 5 minutes and 59 seconds, which is exactly 10.0278552 miles per hour (remember you can round to 2 decimal places!). If the American average mile pace works out to 5.217391932455 miles per hour, and the standard deviation is 2.7896415893 miles per hour, then Darrell is faster than what percent of Americans?

Remember, you can use the Math Is Fun Z Score to area tool: https://www.mathsisfun.com/data/standard-normal-distribution-table.html

A. 95.73%
B. 45.73%
C. 1.72%
D. 4.81%

SOLUTION: 

Remember, Darrell =10.03, Mean=5.22, sd=2.79. 

1) Darrell is 4.81 mph faster than the mean (10.03-5.22=4.81). Z Score IS the number of standard deviations from the mean, so turn 4.81 into standard deviations=4.81/2.79=1.72. 

2) 1.72 is the number of standard deviations that Darrell's speed is from the mean, and Z Score IS number of standard deviations, so Darrell's Z Score is 1.72. 

3) Now, find the area BELOW 1.72 ("Up to Z") on the standard normal table.}
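If you would rather let a computer do the table lookup, here is a rough Python sketch of the same steps (assuming SciPy is installed; norm.cdf returns the "Up to Z" area):

from scipy.stats import norm

darrell, mean, sd = 10.03, 5.22, 2.79
z = (darrell - mean) / sd      # about 1.72 standard deviations above the mean
print(norm.cdf(z))             # area below Z, about 0.9573, i.e. roughly 95.73%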

Now you have a review of the basics. Try another quiz with three more problems.

{Quiz: Z to raw, raw to Z: Practice makes perfect.

Open this link in another window or browser: http://www.espn.com/nfl/statistics/team/_/stat/total
The link will connect you to a table of data for all 32 NFL teams! 

Use the "YDS" column (total yards for the NFL team) to answer the following questions. The mean for total yards "YDS" is 1410.969 and the standard deviation is 192.8788 (you may round to two decimal places):

1. What do the individual numbers under the "YDS" column represent if our population is "all NFL teams"?

A. Observations in a population
B. Observations in a sample
C. Sample means in a theoretical distribution
D. Means in a sample.

F/B: These are observations in a population. Our population is all NFL teams, and this is a table of "all 32 NFL teams" as mentioned below the hyperlink!

2. What percent of teams have more yards than San Francisco? (USE Z Scores! Do not count by hand!)

A. 40.52%
B. 23.9%
C. 9.1%
D. 59.1%

F/B: Here we have the same thing as the last quiz, just different numbers! 

San Francisco is 46.03 yards above average (1457-1410.97=46.03). Divide by s.d.=46.03/192.88=.238. That is the Z Score. Find AREA BEYOND Z ("Z ONWARDS") in the Math Is Fun table and that is the answer!

3. What percent of teams have yard totals that fall between those of Pittsburgh and Denver? 
A. 26.07%
B. 45.0%
C. 87.88%
D. 7.8%}

The last question takes a little creativity in combining concepts you have already learned! Let's call Pittsburgh's yardage (1498) the UPPER score and Denver's (1369) the LOWER score. We want the area between. RULE #1: DON'T PANIC! YOU CAN DO THIS!

Step 1: Compute the UPPER Z Score
Step 2: Compute the LOWER Z Score

Those should be pretty straightforward. 

Step 1: 1498-1410.97=87.03. 87.03/192.88=0.45. Z=0.45.
Step 2: 1369-1410.97=-41.97. -41.97/192.88=-0.22. Z=-0.22.

Now, this is where many get stuck. There is no "Area between two Z Scores" on the Z Score finder! Here is where that creativity comes in. It can best be facilitated by drawing pictures! Students that use this picture drawing technique improve their ability to do statistics over the course of the semester by 3 times as much as those who do not! 

Here is the picture drawing strategy:

{Image: a sketch of the normal curve with the area of interest shaded}
Now go back to your Z Scores and sketch them onto the picture too! 


Hopefully you can see that we now have something we can work with! Specifically, the area between the mean and -.22 + the area between the mean and .45 will give us our shaded area!


 So, use the Z Score finder from Math Is Fun to find each area and you get: 17.36 for area between mean and Pittsburgh, and 8.71 for the area between the mean and Denver. (Notice, there is no negative area! Area is area--just in different directions!)

17.36+8.71=26.07 and that is the answer! 26.07% of teams have total yardage between that of Denver and that of Pittsburgh.
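Here is the same "area between two Z Scores" idea as a short Python sketch, in case you want to double-check the drawing (again assuming SciPy is installed):

from scipy.stats import norm

mean, sd = 1410.97, 192.88
z_upper = (1498 - mean) / sd    # Pittsburgh, about 0.45
z_lower = (1369 - mean) / sd    # Denver, about -0.22
print(norm.cdf(z_upper) - norm.cdf(z_lower))   # about 0.2607, i.e. roughly 26.07%

Subtracting the two "Up to Z" areas gives the same 26.07% as adding the two mean-to-Z areas from the picture.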

Notice there are two ways to do the same thing:

{Image: two equivalent ways of finding the area between the two Z Scores}
Either works, but there are benefits to learning to use both. If you learn both, you will have more tools for quickly finding areas of interest. Students that learn both also tend to better understand what it means to find a certain area beneath the curve. 

Differences for population estimate versus sample statistics

In the last unit, you learned that we can use this same idea of computing Z Scores when we want to know how good of an estimate our sample mean is of the "true" population mean. 

You may also remember that we had to make some adjustments. 

When we don't know the population parameters, we must adjust!

So, we know that we find the standard deviation of an entire population like this: 

σ = √[Σ(X-μ)²/N]
Do you recognize that? If not, pull out your Statistics dictionary! 


In short (or long): "Subtract the population mean from each observation of interest, square it, add it all up and divide by the number in the population"! 

They are your old friends, the columns! 

  • Column 1: X (the observations of interest)
  • Column 2: μ (we were calling it "X bar" or the mean)
  • Column 3: X-μ (observations minus the mean)
  • Column 4: (X-μ)² (Square the differences)
  • Column 5: Σ(X-μ)²/N (Add up the squared differences and divide by the total number of observations)

This was variance!
Now take the square root: σ = √[Σ(X-μ)²/N]
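If it helps to see the columns as steps a computer could follow, here is a tiny Python sketch (the list of numbers is completely made up, just so each step is visible):

import math

x = [2, 4, 4, 4, 5, 5, 7, 9]               # a made-up "whole population"
mu = sum(x) / len(x)                        # the population mean (Column 2)
sq_diffs = [(xi - mu) ** 2 for xi in x]     # (X - μ)² for each observation (Columns 3 and 4)
variance = sum(sq_diffs) / len(x)           # Σ(X - μ)²/N  (Column 5)
sigma = math.sqrt(variance)                 # σ = √[Σ(X - μ)²/N]
print(mu, variance, sigma)                  # 5.0, 4.0, 2.0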

“Holy Z Score Batman!” You have known "√[Σ(X-μ)²/N]" all along!
What a statistician you are becoming. I like to think that this is your proudest day in statistics! If not, perhaps it should be! It means that you understand your first awesome equation and can impress your friends and colleagues. 

Suppose you are in a research meeting and someone that is very statistically-minded sends you an email that says: "Hey, {fill in your name here}. Can you run a quick √[Σ(X-μ)²/N] for me on that dataset?" 

You will now reply: "Sure. NP. ("No Problem" in my generation...)" :)

{Quiz: What is: σ = √[Σ(X-μ)²/N]? 
A. Standard deviation of the population
B. Standard error
C. Standard Operating Procedure
D. The name of a Biology Textbook in Athens}

It is Stats Language for "the standard deviation of the population!"

Adjustments:

Now that you have the equation fresh in your mind, let's turn back to the adjustments we need to make when we do not know the population parameters! 

When we do not know the population parameters, we must adjust.

We must adjust because, when we do not have the whole population, we can only estimate. Statisticians have spent years finding ways to make the estimates better!

The first adjustment is (n-1). 

  1. When we take a sample instead of measuring the whole population, we must use (n-1) instead of n!
So, for σ (population standard deviation) we have: √[Σ(X-μ)²/N].


For s (sample standard deviation) we have √[Σ(X-Xbar)²/(n-1)].
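A quick way to see the two formulas side by side is NumPy's ddof argument, which sets how much is subtracted from n in the divisor (a sketch with made-up numbers, assuming NumPy is installed):

import numpy as np

x = np.array([2, 4, 4, 4, 5, 5, 7, 9])   # made-up data
sigma_style = np.std(x, ddof=0)           # divides by N: the population formula
s = np.std(x, ddof=1)                     # divides by (n - 1): the sample formula
print(sigma_style, s)                     # 2.0 and about 2.14; s is always a bit larger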

Study up for a moment and then take a quiz!

{Quiz: Matching.

σ, s

AND

√[Σ(X-μ)²/N], √[Σ(X-Xbar)²/(n-1)]}

Notice that population parameters are often Greek and sample statistics are often Roman (letters you recognize as an English speaker). 

So, σ (lower-case sigma) is population standard deviation and s is sample standard deviation.

Xbar is the mean of a sample, but μ (myu, moo) is the population mean! 

So far, when the population parameters are not known, or you do not have every element of the population:
  • We must divide by (n-1) instead of n when computing variance and standard deviation. 
  • We use the Roman letters we know (like s and Xbar) for our sample statistics, instead of the Greek symbols (σ and μ) that are reserved for the population parameters!
Now, we are ready to add just a little bit more. Go back to this point:


#1. σ (population standard deviation) = √[Σ(X-μ)²/N].


#2. s (sample standard deviation) = √[Σ(X-Xbar)²/(n-1)].

Which one lets us compute the Z Score for a sample mean as an estimate of the "true" population mean?

#1. almost never exists in practice. Unless your population (the group you want to study) is a school of 200 registered students (or something similar--even a large university registers and keeps track of every student so it has a known population), you may never know the "true" population parameters. This is because you either would have a very hard time locating every member of the population or a very hard time getting information from all of them! (Review previous units on the difficulty--nay, the near impossibility--of surveying the whole nation!)

So, in practice, #1 is usually out. We used it to get us to this point, but from here on out, it will be of little use to us. Good-bye #1! 

#2. Here is a much more practical equation. We take a sample of, say, 300 people, compute standard deviation and all is well--UNLESS YOU WANT TO KNOW SOMETHING ABOUT THE POPULATION (which we do)! This only tells us about our sample!

We need something more! 

We need standard error!

You are becoming so good at Stats language that you can perhaps translate this directly into English. You can use your Stats dictionary for help: 

Standard Error=s/√n

When you are ready, proceed to a quiz:

{QUIZ: Standard Error

Standard Error=s/√n. This means that Standard Error is computed by:
A. Taking the sample standard deviation and dividing it by the square root of the number of observations in the sample. 
B. Taking the sample standard deviation and dividing it by the square root of the number of elements in the population. 
C. Taking the population standard deviation and dividing it by the square root of the number of elements in the population. 
D. Taking some time off after this course...

2 attempts allowed}

S=sample standard deviation. √n=square root of the number of observations in the sample. (Notice n=number in the sample, and N=number in the population). So Standard Error (SE) is computed by simply dividing the sample standard deviation by the square root of the number of observations in the sample. 

Now you try:

{Quiz: Standard error practice

Suppose you take a sample of all Americans. Your sample size is 300. Your sample has mean years of education = 11.3. You compute sample standard deviation and find that it is 4.6. Compute Standard Error (SE). 

A. 0.27
B. 17.32
C. 2.46
D. 8.88

F/B. It is simply sample standard deviation divided by the square root of 300! 4.6/√300.}
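The arithmetic is short enough to check with a couple of lines of Python (using the numbers from the quiz above):

from math import sqrt

s, n = 4.6, 300
se = s / sqrt(n)        # 4.6 divided by about 17.32
print(round(se, 2))     # 0.27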

Standard Error is a very magical thing! Standard error is technically the standard deviation of the sampling distribution. 

Remember: The sampling distribution is a mostly theoretical distribution of all possible samples of a given size! 

Year after year of teaching statistics, a student says something like, "But, Dr. Oliver, how would I actually find the sampling distribution in practice?" 

YOU WOULDN'T! It doesn't make sense! In most cases in practice, if you have that much time and money (and omniscience), you might as well go collect information for all elements of your population and be done! 

If you like, you can think of this as "Stats Magic". Here is the "magic":

If you take a sample and compute the mean and standard error (using sample standard deviation), you can use standard error to tell you what percent of sample means would theoretically be larger or smaller than your sample mean.  

Stop and think about that for 20 seconds... 

The standard normal table (like the one from Math Is Fun) is no longer a graph of how many observations there are at each point; now, it is a graph of how many sample means would fall at a given point. 

When we want to estimate the percent of area under the curve between two sample means in the theoretical sampling distribution, we use Standard Error in the way we used to use standard deviation!  So, Z Score is still the number of standard deviations, but now you must use standard deviation of the theoretical sampling distribution (AKA Standard Error!). 

Often the best way to see it is to try it yourself!

{Quiz: Your first estimate with SE

Suppose you take a sample of all Americans. Your sample size is 300. Your sample has mean years of education = 11.3. You compute sample standard deviation and find that it is 4.6. You also compute Standard Error (SE) and find that it is 0.27. 

What percent of samples will have a sample mean between 11.03 and 11.57?
A. 68.2%
B. 1.2%
C. 0.2%
D 2.2%}

Just like 68.2% is the percent of the area within 1 standard deviation of the mean in either direction, it is also the percent of sample means within 1 standard error of the mean in either direction!
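Here is a small Python check of that quiz answer, with Standard Error playing the role that standard deviation used to play (assuming SciPy is installed):

from scipy.stats import norm

xbar, se = 11.3, 0.27
z_low = (11.03 - xbar) / se    # -1 standard error
z_high = (11.57 - xbar) / se   # +1 standard error
print(norm.cdf(z_high) - norm.cdf(z_low))   # about 0.682, i.e. roughly 68.2%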

Let's summarize what we have so far:

  • σ (population standard deviation) = √[Σ(X-μ)²/N]
  • s (sample standard deviation) = √[Σ(X-Xbar)²/(n-1)]
  • SE (theoretical sampling distribution standard deviation) = s/√n
If you are working ahead, you may have noticed that, by substitution, SE =
√[Σ(X-Xbar)²/(n-1)] / √n

Does that look terrifying? It just means that you compute Standard Error by taking sample standard deviation [√[Σ(X-Xbar)²/(n-1)]--your old friend, the columns!] and dividing it by √n!

You are going to look very smart to your friends and colleagues very soon! 

Here is the last section for today (you have shown excellent endurance!).

These modules have alluded several times to the idea that we never know the "true" population parameters. 

This week, we have also learned that our sample mean will probably never match the "true" population mean, but will probably be close to it. 

Think of our bell curve again: 


The middle (dashed green) line is now the "true" population mean! 1Sd is now 1SE, 2Sd is now 2SE and so on...

Knowing this, we can tell how CONFIDENT (<--key word alert!) we are about the location of the "true" population mean. 

This is great stuff! Stats Magic!

Go back to Z Scores of observations in a sample. Suppose we have a classroom of 35 students with an average ACT score of 17.37 and standard deviation of those scores equal to 7.50. 

Now (AND THIS IS WHERE IT GETS GOOD!) suppose someone walked up to you and said, "Mr./Mrs./Ms. Emerging Statistician, I will bet that if I pick a random student from the class, you will not be able to guess their ACT score." 

Being an emerging statistician, you know your best guess is the average score, but you also know that most of the scores are NOT average! Guessing the average score gives you the best chance of being the CLOSEST to any randomly picked score, but it gives you almost 0 chance of picking the exact score! 

Try the "No One is Random" simulator on the next page if you want to see it in action. (Macros must be enabled).

{Next page}

Ok. You will never be able to do it. 

So, having taken Dr. Oliver's class, you get an idea and counter, "Mr./Mrs./Ms. Person That Is Challenging Me To A Bet, I will make you a new bet. We all know that your proposal is a highly unlikely wager, but I know some fancy statistics and I can give you a range that will contain the score of whichever student you draw at random from the classroom."

Blindsided by your statistical knowledge, the person allows it. 

Now, suppose you want a 95% chance of getting it right, what range would you give? 

Pause to think for a minute or two. What range would you give? 

You know the mean is 17.37 and that most observations fall close to the mean. So just take the 95% that are closest to the mean!

Just take the 95% that are closest to the mean! (Yes, that was deliberately repeated for emphasis!)

Notice that 95.4% of cases fall between -2Sd and +2Sd. 

As it turns out, 95% of cases fall between -1.96Sd and +1.96Sd. You can see this yourself by using the Z Score calculator and finding 47.5% of the area between 0 and Z. (47.5+47.5 adds up to 95!). 

To do this, we multiply S by 1.96 and -1.96 then add each to the mean:

  • 1.96*7.5= 14.7
  • -1.96*7.5= -14.7
Now add (subtract) each to (from) the mean of 17.37:
  • 17.37+14.7=32.07
  • 17.37-14.7=2.67
Now, you can say that you are 95% CONFIDENT (<--key word alert!) that the student to be drawn at random will have an ACT score between 2.67 and 32.07. 
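That whole bet boils down to a few lines of arithmetic; here is a quick Python sketch of the endpoints (plain arithmetic, nothing extra to install):

mean, sd = 17.37, 7.50
lower = mean - 1.96 * sd    # 17.37 - 14.7 = 2.67
upper = mean + 1.96 * sd    # 17.37 + 14.7 = 32.07
print(lower, upper)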

This same idea works for sample means when we want to capture the "true" population mean. 

Suppose now that you want to know the "true" mean ACT score for all Americans. You take a random sample of 150 people and ask their ACT score. The sample has an average of 18.59 and standard deviation of 6.33. Use standard error (instead of sample standard deviation) to find the 95% CONFIDENCE INTERVAL (the two numbers that are the lower and upper bound of the middle 95% of the distribution). 

First, find standard error by dividing standard deviation by √n:

6.33/√150=6.33/12.25=0.52=SE

Now we need to find 1.96 Standard Errors in either direction from the mean.

1.96*0.52=1.01
-1.96*0.52= -1.01

Then add/subtract to/from the mean:

18.59+1.01=19.60
18.59-1.01=17.58

So, we are 95% confident that the interval between 17.58 and 19.60 contains the "true" population mean!

We call this a 95% Confidence Interval.

It also means that we were not willing to accept greater than a 5% chance of being wrong...
We call that our ALPHA level. It is represented by the Greek alpha= α
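Here is the same confidence-interval recipe as a short Python sketch (numbers from the ACT example; the 1.96 is the Z value that captures the middle 95% of the curve):

from math import sqrt

xbar, s, n = 18.59, 6.33, 150
se = s / sqrt(n)              # about 0.52
lower = xbar - 1.96 * se      # about 17.58
upper = xbar + 1.96 * se      # about 19.60
print(lower, upper)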


Conclusions!

This was a big unit this week. You have learned:
  • σ (population standard deviation) = √[Σ(X-μ)²/N]
    • Used when you have all elements of the population (RARE!)
  • s (sample standard deviation) = √[Σ(X-Xbar)²/(n-1)]
    • Used when you want to compute Z Scores for observations in a sample
  • SE (theoretical sampling distribution standard deviation) =s/√n
    • Used when you want to compute Z Scores and Confidence Intervals for sample means relative to the sampling distribution
  • 1.96 "Standard Errors" from the mean in either direction gives us a 95% Confidence Interval.
  • ALPHA level (represented by the Greek alpha= α) reflects the greatest chance we are willing to take of being wrong! (We will refine that wording a little in future units!)
You will see these concepts again and get more practice with them until they are very familiar!

Good luck as you continue on the adventurous path of statistics!

Wednesday, September 28, 2016

Z SCORES FOR SAMPLE MEANS

CAVEAT

Hopefully everything from the Descriptive vs. Inferential statistics module makes sense. If not, consider reviewing it again! We are going to take that information and combine it with what you know about standard deviation to move onward!

Let's start today with a video overview of what we will be discussing:

{VIDEO}

Introduction

In the lesson on Z Scores, we learned that Z Score is nothing scary--it just means "number of standard deviations from the mean". So, what if we have a dataset with a mean of 50 and standard deviation of 15? What is the Z Score of an observation in that dataset with the value 80? 

Take a quiz to see if you remember!

{Quiz. A: Z Score=2}

That quiz is a reminder that Z Score is nothing scary. It just means "number of standard deviations from the mean!" Sometimes we have to first figure out how far away something is from the mean, but it is otherwise quite straightforward!

"So, how can we combine this with inferential statistics?"

I'm glad you asked! We can do the same thing with sample means that we can with observations in a sample.

Observations in a Sample vs. Sample Means in a Distribution

^^^^ That subtitle is a little intense! Let's break it down into English. (As always, this is about translating from Statistics language to English--or really whatever language you are learning stats in!).

Examples usually help make the point:

EXAMPLE OF OBSERVATIONS IN A SAMPLE

This one should seem familiar. Suppose we have a classroom of 10 students (it is a small classroom). They took a stats test and got the following scores:

98, 92, 90, 95, 88, 94, 95, 100, 99, 98

(They are very good students)

If we want to know the Z Score of the first observation (98), what do we do? What we have been doing all along! 

Let's try again (just for fun!).

Start by adding up the observations and dividing by the number of observations:

Xi
98
92
90
95
88
94
95
100
99
98
94.9<--mean!

Now, subtract 94.9 (the mean) from each observation to get the difference of each observation from the mean (that is the heart and soul of variance!).

Xi Xi-Xbar
98 3.1
92 -2.9
90 -4.9
95 0.1
88 -6.9
94 -0.9
95 0.1
100 5.1
99 4.1
98 3.1
Now, add up those differences. REMEMBER, THIS IS A CHECKPOINT AND SHOULD COME OUT TO ZERO, MEANING WE HAVE SUCCESSFULLY BALANCED THE SEESAW!

Xi-Xbar
3.1
-2.9
-4.9
0.1
-6.9
-0.9
0.1
5.1
4.1
3.1
-5.6843E-14

Wait! The sum is not zero, but -0.000000000000056843! 

All is well! In Stats language, we call this "rounding error". It means that we rounded to tenths (0.1) or hundredths (0.01) so our numbers are not exact. That is fine! As long as that number is tiny, we are still OK.

*The very aspiring and astute student will usually ask at this point, "How tiny is tiny?" As a rule of thumb, if this number is less than 0.01% of the mean, you are (probably) OK! So take your mean, multiply by .0001 and if the difference is less than that, chances are it is just rounding error!*

Next, square the differences so that, when we add them up, they don't just cancel out to zero!

Xi Xi-Xbar diff^2
98 3.1 9.61
92 -2.9 8.41
90 -4.9 24.01
95 0.1 0.01
88 -6.9 47.61
94 -0.9 0.81
95 0.1 0.01
100 5.1 26.01
99 4.1 16.81
98 3.1 9.61
94.9 -5.6843E-14 142.9
So, we have the summed squared differences equal to 142.9!

Now, how do we compute variance? Divide by the number of observations (n)!

142.9/10=14.29

14.29 = variance

How do we get standard deviation? Remember that variance is S (standard deviation) squared! So, just take the square root of variance to get S (standard deviation). 

sq rt of 14.29=3.78! This is standard deviation! 

Now, to get the Z Score--the number of standard deviations from the mean--we need to know how far the observation of interest (98) is from the mean:

98-94.9=3.1

So, how do we find out how many standard deviations 3.1 is? The same way you find out how many feet 24 inches is--DIVIDE! 

3.1/3.78=0.82

So, 0.82 is the number of standard deviations that 98 is from the mean! We can translate it like this: Z Score of 98 = 0.82. 

So, the Z Score is 0.82! 

EASY! (?)
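Easy?! If you want to double-check the whole thing by computer, here is the same computation as a small Python sketch (same ten scores as above; only basic Python is needed):

import math

scores = [98, 92, 90, 95, 88, 94, 95, 100, 99, 98]
mean = sum(scores) / len(scores)                                # 94.9
variance = sum((x - mean) ** 2 for x in scores) / len(scores)   # 142.9 / 10 = 14.29
sd = math.sqrt(variance)                                        # about 3.78
z = (98 - mean) / sd                                            # about 0.82
print(mean, variance, sd, z)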

EXAMPLE OF SAMPLE MEANS IN A DISTRIBUTION

Now, suppose we want to know the last stats exam score for every college student in America! It just became unfeasible to collect all of those scores! It will cost too much (time, money, etc). This means you will have to take the largest sample you can and hope it comes close to the "true" population average! 

OF UTMOST IMPORTANCE--> As the video shows, choosing the largest sample will give you the best chance of getting a sample average that matches the "true" population average. This is due to the "Law of Large Numbers" and the Central Limit Theorem (CLT)! More on these later...

Think back to the video now. Imagine 100,000 test scores (I won't put a column that big in this module, so your imagination will have to do!). Let us suppose that you can afford to do a sample of 100 students. You also have taken a stats class and know that you should take a random sample to give yourself the best chance at that sample average matching the "true" population average. 

Random = All elements in the population have a non-zero chance of being chosen and all elements have an equal chance of being chosen. 
Because we are taking a random sample, we never know which 100 observations will be chosen. This means that there will be some variation in the sample's average compared to other possible samples. In the video, we used a simulator to show that a sample of a given size will be a little different each time if you take repeated samples. 

This means that (drum roll...) SAMPLE MEANS HAVE A DISTRIBUTION THEMSELVES!


So, now picture that we take 10 samples of 100 (10 samples of 100 observations each) and we record the average for each sample of 100. The first has an average score of 78, the second has an average of 72, the third of 70 and so on.

Here are the averages for all 10 samples of 100:

78, 72, 70, 75, 68, 74, 75, 80, 79, 78

THERE IS GOOD NEWS AND BAD NEWS!

The good news is that we can still compute Z Scores in the same GENERAL way we have been doing! The bad news is that we now have to make some adjustments because we are not collecting all the information in the population--we are trying to estimate it.

When we estimate, we must adjust.
When we estimate, we must adjust.
When we estimate, we must adjust. 


Let's first look at what we do exactly the same way as what we have been doing:

Means Mean-Avg. of the means diff^2
78 3.1 9.61
72 -2.9 8.41
70 -4.9 24.01
75 0.1 0.01
68 -6.9 47.61
74 -0.9 0.81
75 0.1 0.01
80 5.1 26.01
79 4.1 16.81
78 3.1 9.61
74.9 -5.68434E-14 142.9

Notice that there are slight modifications to the headings because we now have Means in column 1. We still take the average of that column just like always and it gets a little awkward in the wording because we now have a mean of the means...


To slip out of that awkward wording, we call it THE GRAND MEAN!

The Grand Mean!
Here it is again, with the new title:

Means Mean - Grand Mean diff^2
78 3.1 9.61
72 -2.9 8.41
70 -4.9 24.01
75 0.1 0.01
68 -6.9 47.61
74 -0.9 0.81
75 0.1 0.01
80 5.1 26.01
79 4.1 16.81
78 3.1 9.61
74.9 -5.68434E-14 142.9


Notice that everything up to this point is just as you are used to.

However, when we divide the sum of squared differences (142.9), we must make some of those adjustments!

When we estimate, we must adjust!

Because we took samples, and did not get information from the whole population, there is going to be some error in our variance and standard deviation if we just divide 142.9 by the number of means (10). This is where "n-1" comes into play! We have 10 means, but we can subtract 1 as an adjustment that will make the variance and standard deviation closer to the "true" population variance and standard deviation.

There will be a quiz in a minute, so be sure to take note:

FOR SD OF OBSERVATIONS IN A SAMPLE, DIVIDE BY N
FOR SD OF MEANS IN A DISTRIBUTION, DIVIDE BY N-1!

DESCRIPTIVE SD = DIVIDE BY N
INFERENTIAL SD =DIVIDE BY N-1!

So 142.9/10 gives variance =14.29
BUT 142.9/9=15.88

***Some of you will be asking, "How do we know that dividing by n-1 gives us a better estimate of the "true" population standard deviation than dividing by just n?" Here is proof:

Option 1




Option 2, think of this as more STATS MAGIC! 


If you have all elements of the population, divide by n, if not, divide by n-1!

Most students prefer Option 2...***

Now that we have 142.9/9=15.88, we know the variance.

We can take the square root and have standard deviation! sq rt of 15.88=3.98!
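Here is the inferential version of that same calculation as a short Python sketch (same ten sample means; notice the only change is dividing by n-1):

import math

means = [78, 72, 70, 75, 68, 74, 75, 80, 79, 78]
grand_mean = sum(means) / len(means)                  # 74.9
ss = sum((m - grand_mean) ** 2 for m in means)        # 142.9
variance = ss / (len(means) - 1)                      # 142.9 / 9, about 15.88
sd = math.sqrt(variance)                              # about 3.98
print(grand_mean, variance, sd)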

Now you try...

{quiz}

Hopefully that quiz went well. If not, don't worry, you will get more practice!

Means have a distribution

Some of you will have caught on by now, but sample means have a distribution just like observations in a sample! Remember the bell curve?

z-score normal distribution

It still works! It is mostly a theoretical distribution, but if you took all possible samples of a given size, the sample averages would make this same shape! Most would be close to the "true" population mean, right in the center. Few would be much larger or smaller than the "true" population mean. Everything that holds for observations within a sample also holds for sample means in the theoretical distribution of all possible samples. The standard deviation is computed (almost) the same way (just remember n-1 instead of n!), and the standard deviations still correspond to the same respective areas under the curve at each point. 

Now is a good time to memorize some of those main areas:

Area between -1 and 1=68.2%
Area between -2 and 2=95.4%
Area between -3 and 3=99.7%

Take a second to commit them to memory, then try a quiz!

{quiz}

How did that go? You will definitely be seeing more of those in the future so be sure to memorize them if you didn't get them correct on the quiz!


Population parameters:

The theoretical distribution of sample means has a standard deviation, but because it is a theoretical distribution, we almost never know what it "truly" is. We estimate it like this: standard deviation of your sample divided by the square root of the number in your sample. 

In stats language: s/(√n)

{quiz. How do you write: standard deviation of your sample divided by the square root of the number in your sample in Stats language?}. 

Great! This standard deviation also has a special name: "standard error"!

Standard error = standard deviation of the theoretical distribution of all possible samples ≈ standard deviation of your sample divided by the square root of the number in the sample = s/(√n)

≈ (means that it is roughly equivalent. In this case, it is an estimate!)

Conclusions

  • When we have a really big population, and it is not feasible to get information from every single person, we can take a sample and use it to estimate the true population mean!
  • We can take a sample of a given size and compute the mean of that sample in the same way we have been computing the mean all along!
  • The larger the sample, the more likely it will be to be a close estimate of the "true" population mean.
  • We can compute Z Scores for sample means the same way we computed Z Scores for observations in a sample--the only difference is that we divide the sum of squared differences by n-1 instead of just n. That helps make the standard deviation a more accurate estimate of the "true" population standard deviation.
  • There is a theoretical distribution of all possible random samples of a given size. It has the standard normal shape with standard deviations that correspond to the same respective areas under the curve as the standard normal distribution you have learned about already!
  • The theoretical distribution of all possible random samples of a given size has a standard deviation that is almost never known! You can estimate it by computing the standard deviation of your sample and dividing by the square root of your sample size! This standard deviation has a special name=standard error!