Wednesday, September 28, 2016

DESCRIPTIVE VS. INFERENTIAL STATISTICS

Introduction to inferential statistics


In the 1991 hit movie "What About Bob?", psychiatric patient Bob Wiley tells his therapist something like, "There are only two kinds of people in the world, those who love Neil Diamond and those who don't."

Some might call Bob's statement a "false dichotomy", but there really are only two kinds of statistics: those that DESCRIBE and those that INFER.








 
So far, we have been dealing with those that describe. Now, we are going to start talking about statistics that infer! (It is a great day!)

Descriptive statistics are very nice to us...they don't care about anything other than describing the data that we have in front of us. 

The real fun comes when we use statistics to scientifically predict things. When I was a child, I always wanted to become a weather person...I dreamed about how everyone would love me because of my ability to predict the weather. Almost like magic, I would predict impending weather disasters and save entire cities. Well, I became a statistician, and you are becoming one too! So, we can do similar things (and, sadly, sometimes get it just as wrong as the weather people :(  More on this later...)

Samples: Because collecting data from everyone is too costly!


The entire group of people* you want to study is called the "population". If you can get information from everyone in your population, you are all set! There is nothing else to do besides the descriptive statistics we have been learning: mean, median, mode (measures of central tendency), standard deviation, variance, skew, kurtosis, and maybe some nice visuals: bar chart, pie chart, histogram, frequency table and so on.

Those things tell you all about--or describe--your population.
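Here is a minimal sketch of what "describing" looks like in practice, using Python's built-in statistics module. The ages below are made up purely for illustration; pretend they are the ages of every single person in a tiny 10-person population.

```python
import statistics

# Hypothetical data (invented for this sketch): the ages of ALL 10 people
# in a tiny population, so nobody is left out.
ages = [23, 25, 25, 31, 34, 38, 41, 41, 41, 52]

# Measures of central tendency
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
mode_age = statistics.mode(ages)

# pstdev and pvariance are the *population* versions, which fit here
# because we have data from everyone in the population.
sd_age = statistics.pstdev(ages)
var_age = statistics.pvariance(ages)

print("mean:", mean_age)
print("median:", median_age)
print("mode:", mode_age)
print("standard deviation:", round(sd_age, 2))
print("variance:", round(var_age, 2))
```

Notice that nothing is being estimated here. Because we have the whole population, these numbers simply are what they are.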

So, where do inferential statistics come in? We need inferential statistics when we are no longer able to gather data from the entire population!


Gathering data from every person in the entire population is called a "census". Hence the name of the U.S. Census that happens every 10 years--they are trying to gather data from everyone!

Q. So, why not gather data from the whole population?
A. It is too costly!

If you can afford to collect data from every member of your population (the group of people that you want to study), then do it! Usually, that is not the case.


If you need to know how many people in your office prefer cheese pizza, pepperoni, or black olive, then what is your population?


Thanks for taking that quiz! Hopefully by now you realize that your population is the group of people you want to study!

After you ask everyone in your office, you might have a frequency table like this:

Pizza          Fx       %
Cheese          5    35.7%
Pepperoni       7    50.0%
Black Olive     2    14.3%
TOTAL          14   100.0%

Now, there is nothing else to do! You know you will need 35.7% of your pizza to be cheese, 50% to be pepperoni and so on (hopefully the pizza place is good at math!).
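If you would rather let a computer do the dividing, the percentages in the table can be reproduced with a few lines of Python. This is only a sketch; the variable names are my own, and the counts are the ones from the office example.

```python
# Counts from the office pizza example (the whole population of 14 people).
counts = {"Cheese": 5, "Pepperoni": 7, "Black Olive": 2}

total = sum(counts.values())

# Each percentage is simply frequency / total * 100, rounded to one decimal.
percentages = {pizza: round(100 * n / total, 1) for pizza, n in counts.items()}

for pizza, n in counts.items():
    print(f"{pizza:<12}{n:>3}{percentages[pizza]:>8.1f}%")
print(f"{'TOTAL':<12}{total:>3}{100.0:>8.1f}%")
```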

But what if your population is much larger? Sociologists, doctors and others are often interested in national (or even global) trends.

So, what if your question is: "What are the pizza preferences OF AMERICANS?"

Imagine that. Really imagine that. What if your next assignment said, "Find out the pizza preference of each and every American"? How would you do it? A survey would cost you THOUSANDS of dollars even if you had already identified every American. Even then, perhaps only 10% would respond without an incentive. Maybe if you gave everyone a $25 gift card for participating, you could get 50% of people to respond. With roughly 320 million Americans, that is about 160 million gift cards, which would cost you about $4 BILLION! Let's make it a $5 gift card and assume we still get a 50% response rate (which we probably won't). That is only (*sarcasm alert*) going to cost you about $800,000,000. All of this isn't even taking into account the time it will take to do the surveys and analyze the data, or what it will cost to hire staff to do all of those things.
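The gift card arithmetic can be sketched in a few lines. The population figure of 320 million is my own rough assumption for the number of Americans (the exact number will shift the totals a bit), and the function name is made up for this example.

```python
def gift_card_cost(population, response_rate, card_value):
    """Total spent on gift cards if response_rate of the population responds."""
    respondents = population * response_rate
    return respondents * card_value

# Assumption: a rough U.S. population figure, used only for illustration.
US_POPULATION = 320_000_000

cost_25 = gift_card_cost(US_POPULATION, 0.50, 25)  # $25 card, 50% respond
cost_5 = gift_card_cost(US_POPULATION, 0.50, 5)    # $5 card, 50% respond

print(f"$25 cards: ${cost_25:,.0f}")
print(f"$5 cards:  ${cost_5:,.0f}")
```

Either way, the bill lands somewhere between hundreds of millions and billions of dollars, and that is before paying anyone to run the survey.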

Bottom line: We need a better way!

It turns out that we can estimate (key word alert!) ESTIMATE these preferences by taking a sample, a smaller number of people from our total population.

YOU WILL NOW BE LEARNING 2 RULES OF SAMPLING IN DETAIL:

1) If you use a random sample, you can use it to estimate the "true" statistics of your population (The "true" statistics of your population are called parameters. So we have Sample Statistics and Population Parameters! SS, PP).

2) The larger your sample is, the more confident you are that it is a good reflection of the population parameters.
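To see Rule 1 in action, here is a small simulation sketch. The population is completely made up (10,000 pretend people, exactly half of whom prefer pepperoni), so in this toy world we actually know the parameter and can compare the sample statistic against it.

```python
import random

random.seed(42)  # fixed seed so the sketch gives the same result each run

# Hypothetical population: 10,000 people, exactly 50% prefer pepperoni.
population = ["pepperoni"] * 5000 + ["other"] * 5000

# Population Parameter: the "true" share, knowable only because we invented it.
parameter = population.count("pepperoni") / len(population)

# random.sample draws without replacement, giving every person the same
# chance of being chosen (a simple random sample).
sample = random.sample(population, 200)

# Sample Statistic: the share of pepperoni fans in the sample.
statistic = sample.count("pepperoni") / len(sample)

print("parameter:", parameter)
print("statistic:", statistic)
```

The statistic will usually land near the parameter without matching it exactly, which is why we call it an estimate.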

Now, allow me to translate into English from Statistics language using the pizza example:

When you cannot get information about pizza preferences for every single person in the group you are interested in studying, you can use a smaller number of people to estimate pizza preferences for the large group! Using this smaller group means that we may or may not get numbers that match the large group. But we can improve our chances of a match by doing two things. First, choose your smaller group without any "rhyme or reason". More specifically, you need to use a method of selection that gives every person in the population a chance to be chosen, and that makes that chance equal for everyone. Rolling a die is a good example (as long as it is not a weighted die from Las Vegas!). The second thing is to get the biggest number of people that you can! The more people you have, the more likely it will be that your numbers match what is really going on in the population!
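The "get as many people as you can" advice can also be checked with a quick simulation. This is only a sketch on an invented population (1 means the person prefers pepperoni, 0 means they don't), and the helper name average_miss is my own. It draws many random samples of each size and measures how far off they are on average.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 people: 1 = prefers pepperoni, 0 = does not.
population = [1] * 5000 + [0] * 5000
true_share = sum(population) / len(population)

def average_miss(sample_size, repeats=200):
    """Average absolute gap between the sample share and the true share."""
    total_gap = 0.0
    for _ in range(repeats):
        sample = random.sample(population, sample_size)
        total_gap += abs(sum(sample) / sample_size - true_share)
    return total_gap / repeats

# Compare small, medium and large random samples.
misses = {n: average_miss(n) for n in (10, 100, 1000)}

for n, miss in misses.items():
    print(f"sample size {n:>4}: average miss {miss:.3f}")
```

In this toy setup the average miss shrinks as the sample grows, but it never quite reaches zero unless you sample everyone.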

Whew! See why Statistics is a helpful language? Instead of the whole paragraph above, once you know Statistics language we can just say: You may use a random sample to estimate population parameters, and those estimates will be more precise the larger the sample is.

Don't worry! You will get the hang of all of this as we go...The key is to remember that Samples have Statistics and Populations have Parameters.

*Here, we will use the term "people" but it is not limited to that. If you are studying Redwood Trees in a certain forest, your population is the group of trees. It might also be fish, amoebas or anything else that you want to study!
