It is easy to lie with statistics, but it is easier to lie without them.

*Frederick Mosteller*

Chapter 3

Sec 3.1

When you examine the relationship between two or more variables, first
ask the preliminary questions as before:

* What individuals do the data describe?

* What exactly are the variables? How are they measured?

* Are all the variables quantitative or is at least one
a categorical variable?

When we have data on several variables, categorical variables are often present
and help organize the data. There is one MORE question you should ask when
examining relations among several variables...

* Do you want to simply explore the nature of the
relationship, OR do you think that some of the variables *explain* or even
*cause* changes in others? That is, are some variables independent and some
dependent? In statistics we keep this familiar concept but change the
vocabulary: independent variables (plotted horizontally) are called
**explanatory variables (x)** and dependent variables (plotted vertically)
are called **response variables (y)**.

The techniques used to study relations among variables are more complex than the one-variable methods. However, the principles that guide examination of data are the same:

* First plot the data, then add numerical summaries.

* Look for overall patterns and deviations from those
patterns.

* When the overall pattern is quite regular, use a
compact mathematical model to describe it.

See Ex. 3.1 (page 122) for the concept at work.

The most effective way to display the relation between two quantitative
variables is a **scatterplot**. A scatterplot shows the
relationship between two quantitative variables measured on the same
individuals. The values of one variable appear on the horizontal axis, and
the values of the other variable appear on the vertical axis. Each
individual in the data appears as the point in the plot fixed by the values of
both variables for that individual. If there is no distinction between the
explanatory and response variables, you may plot either on the horizontal axis.
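
To make the construction concrete, here is a minimal sketch in Python that draws a plain-text scatterplot using only the standard library. The function name `text_scatterplot` and the data (hours studied and exam scores) are invented for illustration and do not come from the textbook. Each individual becomes one point, with the explanatory variable on the horizontal axis and the response variable on the vertical axis.

```python
def text_scatterplot(xs, ys, width=40, height=12, symbol="o"):
    """Draw a plain-text scatterplot: x runs left to right, y runs bottom to top."""
    if not xs or len(xs) != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    x_min, y_min = min(xs), min(ys)
    # Fall back to a span of 1 so a constant variable does not divide by zero.
    x_span = (max(xs) - x_min) or 1
    y_span = (max(ys) - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        # grid[0] is printed first, so flip rows to put large y values on top.
        grid[height - 1 - row][col] = symbol
    body = "\n".join("|" + "".join(cells) for cells in grid)
    return body + "\n+" + "-" * width


# Invented data for illustration only (not from the textbook):
hours = [1, 2, 3, 4, 5, 6, 7, 8]           # explanatory variable (x)
scores = [52, 55, 61, 64, 70, 74, 79, 85]  # response variable (y)

plot = text_scatterplot(hours, scores)
print(plot)
```

With these made-up values the points climb from the lower left to the upper right. In practice a plotting package would normally replace this hand-rolled grid.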

We will discuss Ex. 3.6 (page 125) completely.

Interpreting a scatterplot consists of

* looking for the overall pattern and for striking deviations from that pattern

* describing the overall pattern by the form (linear or some other mathematical model), direction (positive or negative), and strength (how closely the points follow a clear form) of the relationship

* identifying outliers (an outlier is an individual value that falls outside the overall pattern)

* noting any clustering of data

See Ex. 3.1 (page 127) for a strong linear relationship.

Of course, NOT all relationships are linear and not all have a clear direction. When introducing another variable (categorical) into the graph, you can use different colors or symbols to plot points and show the distinction.
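
As a rough sketch of the idea of adding a categorical variable, the Python snippet below plots each point with a symbol chosen by its category. The function name `text_scatterplot_by_group`, the categories, and the data are all hypothetical and serve only as an illustration; the standard library alone is used.

```python
def text_scatterplot_by_group(points, symbols, width=40, height=12):
    """Plot (x, y, category) triples, drawing each category with its own symbol."""
    if not points:
        raise ValueError("points must be non-empty")
    xs = [x for x, _, _ in points]
    ys = [y for _, y, _ in points]
    x_min, y_min = min(xs), min(ys)
    # Fall back to a span of 1 so a constant variable does not divide by zero.
    x_span = (max(xs) - x_min) or 1
    y_span = (max(ys) - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y, category in points:
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        # The categorical variable decides which symbol marks the point.
        grid[height - 1 - row][col] = symbols[category]
    body = "\n".join("|" + "".join(cells) for cells in grid)
    return body + "\n+" + "-" * width


# Invented data: (hours studied, exam score, attended review session?)
points = [
    (1, 50, "no"), (3, 58, "no"), (5, 66, "no"), (7, 73, "no"),
    (2, 60, "yes"), (4, 69, "yes"), (6, 77, "yes"), (8, 86, "yes"),
]
symbols = {"no": "o", "yes": "x"}  # one symbol per category

plot = text_scatterplot_by_group(points, symbols)
print(plot)
print("o = no review session, x = attended review session")
```

Both groups share the same axes, so any difference between the categories shows up as a separation between the two sets of symbols.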
