Monday, April 20, 2015

Z SCORES FOR OBSERVATIONS IN A SAMPLE

OVERVIEW:

In this unit, we will discuss how to calculate and use Z scores for observations in a sample or population. By the end of this lesson you will better understand how to use Z scores. These are the main points we will cover:
  • Z scores tell us how far from the mean an observation is in standard deviations. 
  • If you know a Z score for an observation, you can look up the corresponding percentile based on the area under the curve.
  • You can turn raw scores to Z scores by finding the raw distance from the mean, and then dividing by standard deviation. (raw-mean)/standard deviation
  • You can turn Z scores into raw scores by multiplying Z times the standard deviation and adding that to the mean. (Z * standard deviation)+mean.


Odds are you have taken a standardized test at some point, like the SAT or ACT. You probably saw your score reported as a percentile in addition to the actual score. Percentiles are useful because they let us compare one kind of thing to another.

Some people take the ACT in high school but others take the SAT. While certain colleges may only take one or the other, many will take either. Each test has a different scoring system so it would be like comparing apples to oranges when deciding which students to admit. The percentile ranking gives them some basis for comparison. For example, one student may have scored in the 95th percentile on the ACT and another in the 90th percentile on the SAT, so it seems like the first student performed better, even though we could never know that based on the raw scores alone because of the way the tests use different scoring systems.

A Z SCORE DOES THE SAME THING!

Z scores "standardize" how big or small a raw score is compared to all the other scores, much as your percentile tells you that you scored higher than that percent of people.


UNIT 1

A Z Score is: THE NUMBER OF STANDARD DEVIATIONS AWAY FROM THE MEAN.

Just like that sounds, a Z score of 1 means that the observation is 1 standard deviation from the mean, a Z score of 2 means that the observation is 2 standard deviations from the mean, and a Z score of 1.28695 means that the observation is 1.28695 standard deviations from the mean.

Now you try...

  1. Bill is 182 cm tall, which is 1.2 standard deviations from the mean. What is the Z score?
      • 182
      • 218.2
      • 1.2
      • 1.82
      • None of these

  2. A leaf is 22 cm across--a Z score of 2.9. How many standard deviations is this from the mean?
      • 22
      • 63.8
      • 2.2
      • 2.9
      • None of these

  3. Fahid scored 11 on a test. The average was 10 and the standard deviation was 1. What is the Z score?
      • 11/1=11
      • 11/10=1.1
      • (11-10)/1=1
      • Not enough information
      • Too much information

  4. Sara's shoe size of 11 has a Z score of 2. The average shoe size is 10. Compute the standard deviation:
      • (11-10)/2=0.5
      • (11-2)/10=0.9
      • 11/10=1.1
      • (10+2)/11=1.09
      • Not enough information

  5. Will you ever get the hang of this?
      • Yes!
      • Yes!
      • Yes!
      • Yes!!
      • YES!!!!!!!!!!


How did the quiz go? The idea is that Z SCORE = THE NUMBER OF STANDARD DEVIATIONS FROM THE MEAN!


UNIT 2


So if you know that a score was 32 points from the mean, and the standard deviation is 32, this score is 1 standard deviation from the mean. Because Z score means "the number of standard deviations from the mean", this score, 32 points from the mean, is not only 1 standard deviation from the mean, but we can also say that it has a Z score of 1!

Likewise, if we know that the average IQ is 100, and Michael's IQ of 115 has a Z score of 1, we now know the standard deviation--BECAUSE Z SCORE MEANS "THE NUMBER OF STANDARD DEVIATIONS FROM THE MEAN!" Because Michael's IQ of 115 has a Z score of 1, we know that a distance of 15 points from the mean (115-100) is 1 standard deviation! Being 15 points above the mean is not only a Z score of 1, but 1 standard deviation from the mean--BECAUSE Z SCORE IS THE NUMBER OF STANDARD DEVIATIONS FROM THE MEAN.
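The reasoning in Michael's example can be sketched in a few lines of Python. This is only an illustration: the function name sd_from_z is made up for this sketch and does not come from any textbook or library.

```python
def sd_from_z(raw, mean, z):
    """Standard deviation implied by a raw score, the mean, and a nonzero Z score.

    Hypothetical helper for illustration: rearranges Z = (raw - mean) / sd.
    """
    if z == 0:
        # A Z score of 0 only says the score equals the mean.
        raise ValueError("a Z score of 0 says nothing about the standard deviation")
    return (raw - mean) / z

# Michael: IQ of 115, mean of 100, Z score of 1.
print(sd_from_z(115, 100, 1))  # 15.0
```

The same distance of 15 points would imply a standard deviation of 7.5 if Michael's Z score were 2, which is why both the distance and the Z score are needed.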

Z score and number of standard deviations from the mean are the same thing!

NOTE OF CAUTION: Z score and number of standard deviations are NOT the same thing unless you measure from the mean! Let's hear that again: a Z score is not just some number of standard deviations. It has to be the number of standard deviations from the mean!

FOR EXAMPLE: We know from the example above that the standard deviation is 15. But a Z score of 2 does not correspond to a raw score of 30 (2*15). It corresponds to 130 (the mean of 100 + 2*15).


So, why is this useful?

As mentioned above, Z scores can be used in a similar way to percentiles--to compare observations with different raw scores. 

This is because each Z score goes with a certain percentile. There is a catch though. When we match a Z score to a percentile, we are assuming a normal distribution of our data, which means that most cases are in the middle with fewer and fewer as we taper off to either side. This is shown by the red dotted line in the picture below:

[Figure: z-score normal distribution]



See how there are the most cases in the middle and there are fewer and fewer as you taper off to either side?

This is covered in more depth in the lesson on distribution. (Review it now if needed). 

The green line shows the mean. The other colored lines show the different standard deviations. Now for the first time, we can really see what is "standard" about the standard deviation--the raw distance from one standard deviation to the next is the same! 

Remember that in the IQ example we said that standard deviation was equal to 15 IQ points and the mean was 100. So in that case the green line is at IQ=100. 1 standard deviation (the blue line) is at IQ=115 (100+15); 2 standard deviations (the golden line) is at 130 (100+15*2); and 3 standard deviations (the brown line) is at 145 (100+15*3). Forget about the negative numbers for now...
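As a rough sketch of the arithmetic above, the positions of the lines to the right of the mean can be computed like this. The function name sd_line_positions is invented for this illustration.

```python
def sd_line_positions(mean, sd, max_sd=3):
    """Raw scores at the mean and at 1..max_sd standard deviations above it."""
    return [mean + k * sd for k in range(max_sd + 1)]

iq_lines = sd_line_positions(100, 15)
print(iq_lines)  # [100, 115, 130, 145]

# The gap between neighboring lines is always one standard deviation.
gaps = [b - a for a, b in zip(iq_lines, iq_lines[1:])]
print(gaps)  # [15, 15, 15]
```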

In short, in the IQ example, the green line (the mean) is at 100, and there is a space of 15 IQ points from one line to the next. 

So in this dataset, 15 is our standard, and "deviation" means how far away we are from something (in the case of Z scores--how far from the mean). 

So standard deviation literally means a unit of fixed size away from the mean. But the size of that unit varies from one dataset to another. You have to figure out the raw distance of your standard deviation for each new dataset, but from there, it will be your "standard" way of referring to Z scores and percentiles. 

The reason a given Z score always goes with the same percentile is calculus. Even though the height of our red dotted curve (the distribution) changes at each point along the distribution, we can use calculus to know what percent of the total area falls under the curve up to any given point. This lesson assumes that you are not required to understand the calculus behind that, and in fact, you don't necessarily need to. This is because those areas are always the same for any given Z score in a normal distribution.
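If you are curious what the calculus is doing, here is a minimal sketch that approximates the area under the standard normal curve by adding up many thin rectangles (the midpoint rule). It is not how statistics software does it, and the names normal_pdf and area_between are just chosen for this example.

```python
import math

def normal_pdf(z):
    """Height of the standard normal curve (mean 0, standard deviation 1) at z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def area_between(a, b, steps=10_000):
    """Approximate the area under the curve from a to b with thin rectangles."""
    width = (b - a) / steps
    return sum(normal_pdf(a + (i + 0.5) * width) for i in range(steps)) * width

# Area between the mean (Z = 0) and 1 standard deviation above it (Z = 1).
print(round(area_between(0, 1) * 100, 1))  # 34.1
```

Because the curve is written in terms of Z, the answer does not depend on whether the raw scores are IQ points, SAT points, or anything else.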

Let's keep it simple for now though...

Let's back up and review the major points so far before we go on:
  1. Z score means "the number of standard deviations from the mean"
  2. The standard deviation in one dataset is different than that of another
  3. The thing that is "standard" about standard deviation is that it is your "ruler" for any given dataset. That set distance from the mean can tell you the percentile of any observation in the dataset

We have covered point 1 fairly exhaustively at this point, so let's elaborate on points 2 and 3 a little.

2. For IQs, the mean is 100 and the standard deviation is 15. Those are national averages. But let's go back to the standardized test example we mentioned at the beginning of the lesson. 

The mean on the SAT is usually around 1500, with standard deviation of 250. The mean for the ACT is around 20.8 with standard deviation around 4.8. 

So the SAT and ACT have very different standard deviations (250 and 4.8), but this makes sense once we delve into point #3.

3. In a normal distribution, 34.1% of the cases fall between the mean and the point 1 standard deviation above it. So for the SAT, a standard deviation of 250 means that 34.1% of test takers scored between the mean (1500) and the mean + 250 (1 standard deviation). So if you score a 1750 on the SAT, you outscored that 34.1% of test takers, plus the 50% who scored below the mean. The raw numbers will obviously be different for the ACT. On the ACT, you would accomplish the same feat by getting a 25.6 (mean plus 1 standard deviation=20.8+4.8=25.6).

So a student with an SAT score of 1750 scored in the same percentile as a student with a 25.6 ACT score. NOW WE CAN COMPARE APPLES TO ORANGES (in a way...)!
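Here is a small sketch of that apples-to-oranges comparison. The means and standard deviations are the approximate ones quoted in this lesson; the two applicants at the end are made-up examples, and the names TESTS and z_for are chosen only for this illustration.

```python
# Means and standard deviations as quoted in this lesson (approximate).
TESTS = {
    "SAT": {"mean": 1500, "sd": 250},
    "ACT": {"mean": 20.8, "sd": 4.8},
}

def z_for(test, raw):
    """Z score of a raw score on the named test."""
    stats = TESTS[test]
    return (raw - stats["mean"]) / stats["sd"]

# The two scores from the text land at the same Z score.
print(round(z_for("SAT", 1750), 2))  # 1.0
print(round(z_for("ACT", 25.6), 2))  # 1.0

# Hypothetical applicants: an 1800 on the SAT versus a 28 on the ACT.
print(round(z_for("SAT", 1800), 2))  # 1.2
print(round(z_for("ACT", 28), 2))    # 1.5
```

In the made-up comparison, the ACT applicant sits further above the mean in standard deviation terms, even though 28 looks tiny next to 1800.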

NOTICE, THOUGH, THAT THE PERCENT OF CASES YOU PICK UP BY GOING OUT 1 MORE STANDARD DEVIATION is NOT the same from one standard deviation to the next! This is simply because the curve is getting lower and lower, so going out 1 standard unit covers less and less area as you move toward the extremes (edges).
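To see the shrinking areas in numbers, here is a short sketch using NormalDist from Python's standard statistics module (available in Python 3.8 and later). The helper name band_percent is made up for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def band_percent(lo, hi):
    """Percent of a normal distribution between two Z scores."""
    return (std_normal.cdf(hi) - std_normal.cdf(lo)) * 100

for lo in range(3):
    print(f"{lo} to {lo + 1} s.d.: {band_percent(lo, lo + 1):.1f}%")
# 0 to 1 s.d.: 34.1%
# 1 to 2 s.d.: 13.6%
# 2 to 3 s.d.: 2.1%
```

Each step out is the same raw distance, but it covers less than half the area of the step before it.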

Let's do some examples before we move on to the 2nd quiz!


Use this picture to visualize the examples below.
[Figure: area under curve z scores]

Let's pretend the picture above shows the distribution of IQ scores in the United States.  
The green line is 100, because that is the mean.

Each colored line is 15 IQ points from the next, because that is our "ruler" or "standard" for this dataset. We know that because 15 is the standard deviation for IQs in the United States. In other words, 34.1% of people have an IQ between 100 and 115. (We will show how to calculate this later...)

Now you try. Use the picture above to answer these questions...

  1. What is the IQ of someone at the point of the brown line to the LEFT of the mean (-3 sd)?
      • 100-3*15=55
      • 100-15=85
      • -3*15=-45
      • -1*15=-15
      • None of these

  2. What is the IQ of someone at the point of the golden line to the LEFT of the mean (-2 sd)?
      • 15*2=30
      • 100+15*2=130
      • -2*15=-30
      • 100-15*2=70
      • None of these

  3. What is the IQ of someone at the point of the blue line to the LEFT of the mean (-1 sd)?
      • -1*15=-15
      • 100-15*1=85
      • 100+15=115
      • 15*1=15
      • None of these

  4. What is the IQ of someone at the point of the blue line to the RIGHT of the mean (1 sd)?
      • 100-15=85
      • 1*15=15
      • 100-15*2=70
      • 100+15=115
      • None of these

  5. What is the IQ of someone at the point of the brown line to the RIGHT of the mean (3 sd)?
      • 100-3*15=55
      • 100+2*15=130
      • 100+3*15=145
      • 2*15=30
      • 3*15=45


How did you do? If you missed any answers, consider reviewing the major points and taking the quiz again before moving on. 

So, we know that in a normal distribution, going 1 standard deviation out from the mean covers 34.1% of the area under the curve. (Again, we are not reviewing calculus here so you will have to take my word for it...).

We also know that whatever that raw score ends up being (for IQ it is 15) is our "standard" for measuring how far things are from the mean. We can also use this to compute percentiles. This is how we do it:


HOW TO COMPUTE PERCENTILES:

Look again at the percentages in the picture. 

[Figure: percentages for z scores]


Let me point out something that is very useful: The TOTAL area is 100%, and either half is 50%. This means that we can use the other percentages to figure out percentiles. For example, to compute the percentile of a score 2 standard deviations above the mean, start with 100% (the total area) and subtract the percentage of people with scores greater than 2 standard deviations (2.1% between 2 and 3 s.d. and another 0.1% beyond 3 s.d.). So we have 100%-2.1%-0.1%=97.8%. So if your score is 2 standard deviations above the mean for an exam, you scored in the 97.8th percentile! Good job!

You can also use 50% instead of 100% to do this. So, if you scored 2 standard deviations from the mean, you outscored the 50% in the left half of the distribution, but also the people between the mean and 1 s.d., and the people between 1 s.d. and 2 s.d. So we have 50%+34.1%+13.6%=97.7%. This tells us that you scored in the 97.7th percentile! This is the same answer we got by subtracting from 100, except for some small rounding error caused by only giving the areas out to the tenths place. 
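The two ways of adding up the areas can be sketched as follows. The percentages are the rounded ones from the picture, and the names BAND, TAIL, percentile_from_top and percentile_from_half are invented for this illustration. The sketch only handles whole numbers of standard deviations above the mean (1, 2 or 3).

```python
# Approximate areas (in percent) from the picture, to the tenths place.
BAND = {1: 34.1, 2: 13.6, 3: 2.1}  # area between k-1 and k s.d. on one side
TAIL = 0.1                         # area beyond 3 s.d. on one side

def percentile_from_top(k):
    """Start at 100% and subtract everything above +k standard deviations."""
    above = sum(BAND[j] for j in range(k + 1, 4)) + TAIL
    return 100 - above

def percentile_from_half(k):
    """Start at 50% and add each band between the mean and +k standard deviations."""
    return 50 + sum(BAND[j] for j in range(1, k + 1))

print(round(percentile_from_top(2), 1))   # 97.8
print(round(percentile_from_half(2), 1))  # 97.7
```

The 0.1 gap between the two answers is the rounding error mentioned above, since the four rounded areas on one side add up to 49.9 rather than 50.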

Now you try:

Remember not to worry too much about rounding; just choose the closest answer!

  1. What is the percentile of someone whose test score is -2 standard deviations from the mean?
      • 50-2.1-0.1=47.8 percentile
      • 100-2.1-0.1=97.8 percentile
      • 2.1+0.1=2.2 percentile
      • Not enough information
      • Too much information!

  2. What is the percentile of someone whose test score is 3 standard deviations from the mean?
      • 100-0.1=99.9 percentile
      • 50-0.1=49.9 percentile
      • 0.1 percentile
      • Not enough information
      • Too much information!

  3. What is the percentile of someone whose test score is -1 standard deviation from the mean?
      • around the 34.2 percentile
      • around the 15.8 percentile
      • around the 84.1 percentile
      • Not enough information
      • TMI!

  4. What is the percentile of someone whose test score is -3 standard deviations from the mean?
      • around the 0.1 percentile
      • around the 49.8 percentile
      • around the 99.9 percentile
      • Not enough information
      • TMI!

  5. What percent of people scored between -1 s.d. and 2 s.d.?
      • around 97.7%
      • around 72.3%
      • around 81.8%
      • Not enough information
      • TMI!


Once again, be sure to review anything you missed before moving on!

UNIT 3

If you have passed all the quizzes to this point, congratulations, we are ready to get a little more advanced!

So you may be thinking, my IQ is 128, not 100, 115, 130 or any other number we have talked about--how do I know my Z score and associated percentile?

It is actually painlessly easy! (Sorry to disappoint you...). Remember to think of standard deviation as our "ruler" for our dataset. So, in the case of IQs, the "ruler" is 15 IQ points long, and we have discussed the percentiles associated with 1 ruler, 2 rulers, 3 rulers, and even the negative versions of those. BUT WE CAN HAVE PARTIAL RULERS TOO! So if your IQ is 122.5, you know that you are 1.5 "ruler" lengths from the mean (15+7.5=22.5, and 100+22.5=122.5). So how would we calculate your Z score (how many standard deviations you are from the mean) if your IQ is 128?


How to calculate Z score:


  1. First, find your distance from the mean: 128-100=28, so you are 28 points from the mean.
  2. Now find out how many standard deviations there are in your distance from the mean. In this case, we just divide 28 by 15 (the standard deviation). So 28/15=1.867.
  3. YOU FOUND THE Z SCORE!

It is just like converting inches to feet or feet to miles: you just divide! You are just finding out how many standard deviations are in your difference from the mean.

In terms of an equation, you may see something like this in your textbook:

Z=(X-Xbar)/s

TRANSLATION: "Subtract the mean from the raw score, then divide by standard deviation".

CAVEAT: MAKE SURE TO DO RAW SCORE - MEAN, NOT MEAN - RAW SCORE OR YOU WILL GET THE OPPOSITE SIGN (negative vs. positive) AND A VERY WRONG (OPPOSITE) ANSWER!
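As a quick sketch, here are the two steps and the caveat about the order of subtraction in Python. The function name z_score is just a label for this example, and s is written as sd.

```python
def z_score(raw, mean, sd):
    """Z = (X - Xbar) / s, following the two steps above."""
    distance = raw - mean   # step 1: raw score minus mean (the order matters)
    return distance / sd    # step 2: how many standard deviations fit in that distance

print(round(z_score(128, 100, 15), 3))  # 1.867

# Subtracting in the wrong order (mean - raw score) flips the sign.
flipped = (100 - 128) / 15
print(round(flipped, 3))  # -1.867
```

The flipped version would describe an IQ of 72, well below the mean, instead of 128.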

You can also turn a Z score into a raw score this way:

Raw score=s*Z+Xbar

In our example above, we know a Z of 1.867 goes with a raw score of 128. We can see this using the equation:

128=15*1.867+100.

Because the Z score is really the number of standard deviations from the mean, both equations make sense. If you want a raw score from a Z score, you just multiply Z (the number of standard deviations from the mean) by the standard deviation, which is in raw-score units. Here, Z of 1.867 means "the score that is 1.867 standard deviations from the mean". So 15 (the standard deviation) * 1.867 gives us 28. In other words, a distance of 28 points is 1.867 "fifteens". We then add that to the mean of 100 to get 128.
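Going the other direction can be sketched the same way. The name raw_from_z is made up for this example; it simply implements Raw score=s*Z+Xbar.

```python
def raw_from_z(z, mean, sd):
    """Raw score = s * Z + Xbar."""
    return z * sd + mean

# A Z score of 2 on the IQ scale (mean 100, standard deviation 15).
print(raw_from_z(2, 100, 15))  # 130

# Round trip: turn an IQ of 128 into a Z score, then back into a raw score.
z = (128 - 100) / 15
print(round(raw_from_z(z, 100, 15), 3))  # 128.0
```

The round trip is rounded before printing because dividing and then multiplying by 15 can leave a tiny floating-point error.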

Now you have a go...

Suppose that the average number of points per game for professional basketball teams is 96, with standard deviation of 12 points. Based on this information, answer the questions below:

  1. What percent of teams scored between 84 and 108 points per game?
      • 34.1%
      • 84.1%
      • 18.7%
      • 68%
      • None of these

  2. What is the Z score for a team that scores 112 points per game?
      • 1.3
      • 4.5
      • 0.8
      • 1.5
      • None of these

  3. What is the raw score that corresponds to a Z score of 3.4?
      • 137
      • 145
      • 108
      • 99.4
      • 84.2

  4. What is the Z score for a team that scores 87 points per game?
      • 0.75
      • 0.3598
      • -1.28
      • 1.01
      • -0.75

  5. What is the raw score that corresponds to a Z score of -0.54?
      • 68.73
      • 89.52
      • 133.04
      • 102.48
      • 140.05


How did you do? If you scored 5/5, you probably have a pretty good understanding of Z scores! If you scored below 4/5, you may need to go back for some review!

Finally, if the Z score you are interested in is not 1, 2, 3 or -1, -2, or -3, like in this picture:


[Figure: z score distribution]

you can still find the percentile by using a table like this one from UT Dallas. You just go down the left column until you find the score to the tenths place, and then over to the column for the hundredths place. So if your Z score is -2.67, you go down to -2.6, and then scroll across to the number in the 0.07 column. This table gives the area under the curve (to the left of the Z score), so a Z of -2.67 is greater than .0038, or .38%, of the observations. You could also say this is the 0.38 percentile. That's not where you want to score on the ACT!
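If you have Python handy, the same lookup can be sketched without a printed table using NormalDist from the standard statistics module (Python 3.8 and later). The helper name area_to_left is made up here. Like the table, it assumes the data are normally distributed.

```python
from statistics import NormalDist

def area_to_left(z):
    """Area under the standard normal curve to the left of a Z score."""
    return NormalDist().cdf(z)

area = area_to_left(-2.67)
print(round(area, 4))        # 0.0038
print(round(area * 100, 2))  # 0.38 (the percentile)
```

The same function works for any Z score, such as the 1.867 from the IQ example.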



Summary
Often, it can be useful to compare similar things that have different "scoring" systems. Percentiles are one way to do that. Z scores are another, similar way. 


  • Z scores tell us how far from the mean an observation is in standard deviations. 
  • If you know a Z score for an observation, you can look up the corresponding percentile based on the area under the curve.
  • You can turn raw scores to Z scores by finding the raw distance from the mean, and then dividing by standard deviation. (raw-mean)/standard deviation
  • You can turn Z scores into raw scores by multiplying Z times the standard deviation and adding that to the mean. (Z * standard deviation)+mean.

GOOD WORK! YOU NOW HAVE THE TOOLS TO USE AND UNDERSTAND WHAT Z SCORES MEAN! For more on Z scores, check out the lesson on finding Z scores for a sample mean...
