Introduction to Statistics

The big word is “variability.”  We use data (plural) to study variability.  Data are NOT just numbers; they are INFORMATION.  Data must be recorded systematically and must have a context.

Data come from the “population,” which is made up of cases, also called individuals.  Often the population is too large to study directly, so we study PARTS of it called SAMPLES and draw conclusions about the population from them.

Data can be organized into tables having rows and columns, similar to a matrix.  The columns have headings that correspond to the variables being measured.  Each row represents a single individual, or CASE, and holds all the info for that individual.
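
To make the rows-and-columns idea concrete, here is a minimal sketch in Python.  The three cases and the variable names (height_in, car_color, and so on) are made up for illustration; they are not class data.

```python
# Hypothetical sample: each dict is one CASE (a row of the table),
# and each key is a variable (a column heading).
cases = [
    {"gender": "F", "age": 17, "height_in": 64.5, "car_color": "red", "children": 2},
    {"gender": "M", "age": 18, "height_in": 70.0, "car_color": "blue", "children": 3},
    {"gender": "F", "age": 17, "height_in": 66.0, "car_color": "red", "children": 1},
]

variables = list(cases[0].keys())  # the column headings


def column(name):
    """Return every case's value for one variable (one column of the table)."""
    return [case[name] for case in cases]


print(variables)
for case in cases:
    print([case[v] for v in variables])  # one row per individual
print(column("age"))
```

Reading across a row gives everything recorded about one individual; reading down a column gives one variable for the whole sample.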

In studying samples, we examine 2 kinds of variables: categorical and quantitative.  Although people like to think of categorical variables as groups of like things that are NOT numbers, occasionally they may include numbers, e.g., football jersey numbers.  These are categorical since they describe position played rather than a numerical quantity.

Quantitative variables can be discrete (taking separate, countable values such as whole-number counts) or continuous (able to take any value in an interval between beginning and ending points).
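
As a rough illustration of discrete versus continuous, the sketch below checks whether recorded values are all whole numbers.  The data and the helper name looks_discrete are hypothetical, and the check is only a rule of thumb: heights rounded to the nearest inch would pass it even though height itself is continuous, and jersey numbers would pass it even though they are categorical.

```python
# Hypothetical measurements, for illustration only.
children = [1, 2, 2, 3, 1, 4]              # counts of children: discrete
heights_in = [64.5, 70.0, 66.25, 61.75]    # heights in inches: continuous


def looks_discrete(values):
    """Rough check: True when every recorded value is a whole number."""
    return all(float(v).is_integer() for v in values)


print(looks_discrete(children))
print(looks_discrete(heights_in))
```

The type of a variable comes from what it measures and how it was measured, not only from how the recorded values look.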

Complete index card with this info:

1. gender
2.  age in years
4.  height in inches
5.  favorite car color
6.  # children in your family

Identify each variable by “type.”

There are different ways to measure (gather) data, such as a measuring instrument (tape, ruler), a written test, a survey, or observation.  When you know HOW a variable has been measured, you already know a great deal about it.  When we answered the questions about ourselves, we responded to a survey.  Of course, the validity of the data is linked to the integrity of the individual responses.  Later, when we create our own plan to measure a variable, we must consider the measurement instrument very carefully.

Let’s select one of the categorical variables, display the data in an appropriate way, and make some reasonable statements about the picture and relationships shown.  Categorical data can best be pictured in a bar chart or pie chart.
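
Here is a minimal sketch of a bar chart for a categorical variable, drawn as text.  The responses for favorite car color are invented for illustration; the idea is simply to count the cases in each category and draw one bar per category.

```python
from collections import Counter

# Hypothetical survey responses for "favorite car color".
colors = ["red", "blue", "red", "black", "white", "blue", "red", "black"]

counts = Counter(colors)  # number of cases in each category

# One bar per category, most frequent first; the title acts as the axis label.
print("Favorite car color (count)")
for color, n in counts.most_common():
    print(f"{color:>6} | {'#' * n} {n}")
```

A reasonable statement about this made-up picture would be that red is the most common choice and white the least common.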

We will first look at number of children in your family...and present this data in picture form.  This variable is an example of a quantitative, discrete variable.  ALWAYS include labels on your axes.  The first item we look at with the data is “SHAPE.”  We are NOT interested in calculations until the shape and a few other items of interest are determined.

Since we are graphing a single variable, this is called univariate data.  (Bivariate data has 2 variables and is similar to graphing in the x/y plane.)  There are a variety of ways to picture this type of data:  a histogram, which is a frequency chart; a stem-and-leaf plot, which preserves the values of the data points; a dot plot, which resembles a histogram but doesn’t lose individual data points; and a boxplot, which gives a picture of the data along with additional characteristics that we will cover a little later.

We’ll use the stat plot command on the calculator to draw our histogram and use these procedures throughout the course.  Scale matters in drawing a histogram in a way it didn’t when we were graphing functions in algebra.  (Scale will be the width of the bars in the histogram.)
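
To see why the width of the bars matters, the sketch below groups the same data using three different bar widths.  It is written in Python rather than on the calculator, the ages are made up, and the helper histogram_counts is a hypothetical name, not a calculator command.

```python
# Hypothetical ages in years for a small sample.
ages = [15, 16, 16, 17, 17, 17, 18, 18, 19, 21]


def histogram_counts(values, start, width):
    """Count the values in each bar; a bar covers left <= value < left + width."""
    counts = {}
    for v in values:
        k = (v - start) // width    # index of the bar containing v
        left = start + k * width    # left edge of that bar
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))


for width in (1, 2, 5):
    print(f"bar width = {width}")
    for left, n in histogram_counts(ages, start=15, width=width).items():
        print(f"  {left:>2}-{left + width:<2} | {'#' * n}")
```

With a width of 1 the picture shows every age separately, while a width of 5 lumps nearly everything into one bar and hides the shape.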

PATTERNS occur even within variability.
Let’s do both a histogram and dotplot of the number of children in our families.
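
Before we collect the real numbers, here is a minimal sketch of a text dot plot for number of children.  The responses are invented for illustration.  Because this variable takes whole-number values, a histogram with bars of width 1 would have the same outline as the stacks of dots.

```python
from collections import Counter

# Hypothetical class responses for "number of children in your family".
children = [1, 2, 2, 3, 1, 2, 4, 3, 2, 1, 5, 2]

counts = Counter(children)          # how many cases report each value
lo, hi = min(children), max(children)

# Dot plot: one dot per case, stacked above its value on the axis.
for level in range(max(counts.values()), 0, -1):
    row = "".join(
        " * " if counts.get(v, 0) >= level else "   "
        for v in range(lo, hi + 1)
    )
    print(row.rstrip())
print("".join(f" {v} " for v in range(lo, hi + 1)))
print("number of children")
```

In this made-up sample the dots pile up at 2 and thin out toward the larger families on the right, which is the kind of pattern we look for before doing any calculations.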
