Lottery: A tax on the statistically-challenged.

Chapter 4
    Sec 4.1

Linear regression using the LSRL is not the only model for describing data.  Some data are simply not best described by a line.  Non-linear relationships between two quantitative variables can sometimes be changed into linear relationships by transforming one or both variables.  Removing outliers from data may lower the correlation enough that a linear model no longer does a satisfactory job of describing the data.  The bulk of the data may not be linear at all.

Transforming can be thought of as re-expressing the data.  We may want to transform either the explanatory variable x, or the response variable y in a scatter plot, or maybe even both.  We will call the transformed variable "t" when talking about transformations in general.

Transforming data amounts to changing the scale.  Linear transformations as discussed in Chapter 1 change units by addition or multiplication.  Recall, adding a constant amount to each observation does NOT change the spread but does add that constant to the center and quartiles.  Multiplying by a constant multiplies both the measure of center and the measure of spread by the same constant.  BUT linear transformations of this type CANNOT STRAIGHTEN OUT A CURVED RELATIONSHIP BETWEEN TWO VARIABLES.
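A quick numerical check of these facts can be sketched in Python.  The data values and the constants 10 and 3 are made up for illustration, and pearson_r is a small helper written here, not a library function.

```python
import statistics

def pearson_r(xs, ys):
    """Correlation r between two lists of equal length."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data, chosen only for illustration.
data = [2.0, 4.0, 4.0, 5.0, 10.0]

shifted = [x + 10 for x in data]   # add a constant to every observation
scaled = [3 * x for x in data]     # multiply every observation by a constant

for name, values in [("original", data), ("x + 10", shifted), ("3x", scaled)]:
    center = statistics.mean(values)
    spread = statistics.pstdev(values)   # population standard deviation
    print(f"{name:>8}: mean = {center:.3f}, sd = {spread:.3f}")

# A curved relationship stays exactly as curved after a linear change of scale:
# r between x and y = x^2 is the same before and after transforming x.
curved = [x ** 2 for x in data]
r_before = pearson_r(data, curved)
r_after = pearson_r([3 * x + 10 for x in data], curved)
print(f"r before = {r_before:.4f}, r after = {r_after:.4f}")
```

Adding 10 moves the mean but leaves the standard deviation alone, multiplying by 3 scales both, and the correlation with the curved response does not change at all.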

Some non-linear functions that have been studied in Algebra include quadratic, logarithmic, reciprocal (negative power), square root, and other power functions.

Monotonic Transformations:

A monotonic function f(t) moves in ONE direction as its argument t increases.
A monotonic increasing function preserves the ORDER of data.  If a>b, then f(a)>f(b).
A monotonic decreasing function reverses the ORDER of data.  If a>b, then f(a)<f(b).

The graph of a linear function is a straight line.  The graph of a monotonic increasing function is increasing EVERYWHERE.  A monotonic decreasing function has a graph that is decreasing EVERYWHERE.  A function can be monotonic over some range of t without being everywhere monotonic.  For example, the square (quadratic, parabolic) function t^2 is monotonic increasing for t > 0.  If the range of t includes both positive and negative values, the square function is NOT monotonic since it decreases as t increases for negative values of t and increases as t increases for positive values.  Many variables take only 0 or positive values, so we are particularly interested in how functions behave for positive values of t.  See board for sketches of graphs.

The increasing monotonic functions include linear (positive slope), quadratic for t > 0, and logarithmic.  Power functions t^p for positive powers p, even powers included, are monotonic increasing for t positive.  (Logarithms are not even defined for negative or zero values.)  Order is preserved.  The decreasing monotonic functions include linear (negative slope), reciprocal, and reciprocal square root.  Power functions t^p for negative powers p are monotonic decreasing for t positive and reverse order.
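The order-preserving and order-reversing behavior can be spot-checked on a handful of values.  This is only a sketch: the helper names and the sample values are hypothetical, and passing on a finite sample does not prove a function is monotonic everywhere.

```python
import math

def preserves_order(f, values):
    """True if a > b gives f(a) > f(b) for every pair in values."""
    ordered = sorted(set(values))
    images = [f(t) for t in ordered]
    return all(lo < hi for lo, hi in zip(images, images[1:]))

def reverses_order(f, values):
    """True if a > b gives f(a) < f(b) for every pair in values."""
    ordered = sorted(set(values))
    images = [f(t) for t in ordered]
    return all(lo > hi for lo, hi in zip(images, images[1:]))

positive = [0.5, 1, 2, 3, 10]   # all values > 0
mixed = [-3, -1, 0.5, 2, 4]     # positive and negative values

print("log, t > 0:      ", preserves_order(math.log10, positive))
print("t^2, t > 0:      ", preserves_order(lambda t: t ** 2, positive))
print("1/t, t > 0:      ", reverses_order(lambda t: 1 / t, positive))
print("t^2, mixed signs:", preserves_order(lambda t: t ** 2, mixed))
```

The last line comes out False: with both signs present, squaring neither preserves nor reverses order.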

Non-linear monotonic transformations change data enough to ALTER the shape of distributions and the form of relations between two variables, yet are simple enough to preserve order and allow recovery of the original data.

Strategy for transforming data:
1)  if the variable to be transformed takes values that are 0 or negative, first apply a linear transformation to make the values all positive.  This can be accomplished by adding a constant to all the observations.
2)  next choose a power or logarithmic transformation that simplifies the data, for example one that approximately straightens a scatter plot.  (Note:  the use of the word simplifies takes yet another meaning when applied to graphs of data.)
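The two steps can be sketched as a pair of small functions.  Everything named here is an assumption for illustration: the function names, the margin parameter (how far above 0 the smallest value should land), and the data.

```python
import math

def shift_to_positive(values, margin=1.0):
    """Step 1: add a constant so the smallest value becomes `margin` (> 0).

    Returns the shifted values and the constant that was added.
    """
    smallest = min(values)
    constant = margin - smallest if smallest <= 0 else 0.0
    return [v + constant for v in values], constant

def power_or_log(values, p):
    """Step 2: apply t^p, with p = 0 standing for the base-10 logarithm."""
    if p == 0:
        return [math.log10(v) for v in values]
    return [v ** p for v in values]

raw = [-4.0, 0.0, 5.0, 20.0, 95.0]       # hypothetical data with values <= 0
positive, added = shift_to_positive(raw)  # all values now > 0
logged = power_or_log(positive, 0)        # logarithm is now defined

print("constant added:", added)
print("shifted values:", positive)
print("log10 values:  ", [round(v, 3) for v in logged])
```

Keeping track of the constant that was added is what allows the original data to be recovered later.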

Other things to remember:
1)  Power transformations t^p for powers p greater than 1 are concave up.  They push out the right tail of a distribution and pull in the left tail.  This effect gets stronger as the power increases.
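2)  Power transformations t^p for powers p less than 1 (and logarithmic for p = 0) are concave down.  They pull in the right tail and push out the left tail.  Effect strengthens as p decreases.  (For negative p, read this as -t^p, the version that keeps order; t^p itself is decreasing and concave up.)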
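One way to see the push and pull on the tails is to transform evenly spaced values and look at the gaps between neighbors.  The values 1 through 5 and the powers tried are arbitrary choices for this sketch.

```python
import math

def gaps(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]

t = [1, 2, 3, 4, 5]   # evenly spaced, so every original gap is 1

squared = [v ** 2 for v in t]          # p = 2, concave up
rooted = [v ** 0.5 for v in t]         # p = 1/2, concave down
logged = [math.log10(v) for v in t]    # the "p = 0" case, concave down

print("t^2  gaps:", gaps(squared))
print("sqrt gaps:", [round(g, 3) for g in gaps(rooted)])
print("log  gaps:", [round(g, 3) for g in gaps(logged)])
```

For t^2 the gaps grow toward the right (the right tail is pushed out).  For the square root and the logarithm the gaps shrink toward the right (the right tail is pulled in), and they shrink faster for the logarithm.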

There are a variety of transformations available to us depending on the appearance of the data.  Using a "trial and error" or "try it and see" approach isn't very satisfactory or efficient.  It is much more logical to begin with a theory or mathematical model that we expect to describe a relationship.  The transformation needed to make the relationship linear is then a consequence of the model.  One of the most common models is EXPONENTIAL growth.

A variable grows linearly over time if it ADDS a fixed increment in each equal time period.  Exponential growth occurs when a variable is MULTIPLIED by a fixed number in each time period.  (Exponential growth increases by a fixed percentage of the previous total).  We have studied exponential growth before when dealing with interest problems, bacterial growth, or epidemic situations.  See Ex. 4.3 page 204 for a good example.
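The ADD versus MULTIPLY distinction is easy to see side by side.  The starting value, the increment of 10, and the 10% growth factor below are made-up numbers for this sketch.

```python
start = 100.0
increment = 10.0   # linear growth: ADD 10 each period
factor = 1.10      # exponential growth: MULTIPLY by 1.10 each period (10%)
periods = 5

linear = [start + increment * n for n in range(periods + 1)]
exponential = [start * factor ** n for n in range(periods + 1)]

# Linear growth has constant differences; exponential has constant ratios.
differences = [b - a for a, b in zip(linear, linear[1:])]
ratios = [b / a for a, b in zip(exponential, exponential[1:])]

print("linear:     ", [round(v, 2) for v in linear])
print("exponential:", [round(v, 2) for v in exponential])
print("differences:", [round(d, 2) for d in differences])
print("ratios:     ", [round(r, 2) for r in ratios])
```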

What does exponential growth look like???  Can we trust our eyes to recognize it???  How can we be sure that we have a correct model???

When exponential growth is suspected from viewing the data, the first step is to calculate the ratios of consecutive terms (for equally spaced x values) to see that they are approximately the same.  Exponential form is y = a b^x (variable in the exponent), and we will transform by taking the log of both sides, yielding log (y) = log (a b^x).  This statement simplifies using one of the log properties to become log (y) = log (a) + log (b^x), which becomes log (y) = log (a) + x log (b).  When we plot log (y) against x, we should observe a straight line for the transformed data.  Apply least-squares regression to the transformed data and check the correlation (r) value and the coefficient of determination (r^2) value for suitability.
Remember r^2 gives a measure of how successful the regression was in explaining the response.
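The whole procedure (ratios, log transform, least-squares line, r and r^2) can be sketched in a few lines.  The data are invented to behave roughly like y = 2 * 3^x, and least_squares is a hand-written helper, so treat this as an illustration rather than the textbook's method.

```python
import math

def least_squares(xs, ys):
    """Return (intercept, slope, r) for the least-squares line of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope, sxy / math.sqrt(sxx * syy)

x = [0, 1, 2, 3, 4, 5]
y = [2.1, 5.8, 18.5, 53.0, 165.0, 480.0]   # made-up, roughly 2 * 3^x

# Step 1: ratios of consecutive terms should be roughly constant.
ratios = [later / earlier for earlier, later in zip(y, y[1:])]

# Step 2: regress log(y) on x.  The line is log(y) = log(a) + x log(b).
log_y = [math.log10(v) for v in y]
intercept, slope, r = least_squares(x, log_y)

# Step 3: undo the logs to recover a and b in y = a b^x.
a, b = 10 ** intercept, 10 ** slope

print("ratios:", [round(q, 2) for q in ratios])
print(f"log(y) = {intercept:.4f} + {slope:.4f} x")
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
print(f"back-transformed model: y = {a:.3f} * {b:.3f}^x")
```

The intercept of the fitted line estimates log (a) and the slope estimates log (b), which is why both are raised as powers of 10 at the end.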

Even when r^2 is high, you should ALWAYS inspect the residual plot to further assess the quality of the model.  The residual plot should appear totally random, showing no pattern.  If any part of the residual plot shows a recognizable pattern, then perhaps a few data points should be removed to see if r^2 can be improved.
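Without a graphing tool, a crude stand-in for the residual plot is the string of residual signs read from left to right.  The sketch below uses invented, roughly exponential data and compares a line fitted to y with a line fitted to log (y); the helper names are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept and slope of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

def residuals(xs, ys):
    """Observed minus predicted values for the least-squares line."""
    intercept, slope = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def sign_pattern(values):
    """One character per residual: '+' above the line, '-' below."""
    return "".join("+" if v > 0 else "-" for v in values)

x = [0, 1, 2, 3, 4, 5]
y = [2.1, 5.8, 18.5, 53.0, 165.0, 480.0]   # made-up, roughly exponential

raw_resid = residuals(x, y)                            # line fitted to y
log_resid = residuals(x, [math.log10(v) for v in y])   # line fitted to log y

print("y vs x     :", sign_pattern(raw_resid))
print("log y vs x :", sign_pattern(log_resid))
print("largest log-scale residual:", round(max(abs(v) for v in log_resid), 4))
```

For the straight line fitted to y, the signs run positive, then negative, then positive again: a curved pattern that says the linear model is missing something.  After the log transformation the residuals are tiny and show no such sweep.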

We will do Ex. 4.6 page 209 by hand and using technology.

(in general  y = a x^p)

When you order pizza you order by its diameter, like 10 inches...BUT when you eat pizza you eat its AREA, or A = πr^2.  Notice there is an exponent present, but it is not a variable; it is a fixed power.  Recall that all power functions have a "look" of their own.  Graphs of odd powers have ends that go in opposite directions; graphs of even powers have ends that point in the same direction.

When we are dealing with things of the same general form, whether circles or fish or people, we expect area to go up with the square of a dimension such as diameter or height.  Volume should go up with the cube of a linear dimension.  Geometry tells us to expect power laws in some settings.
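The pizza arithmetic makes the point concrete.  The 20-inch pizza and the function names are assumptions added for this sketch.

```python
import math

def pizza_area(diameter):
    """Area of a round pizza: A = pi r^2 with r = diameter / 2."""
    return math.pi * (diameter / 2) ** 2

def scale_factor(k, power):
    """How much a quantity grows when every linear dimension is multiplied by k."""
    return k ** power

small, large = 10, 20   # diameters in inches; the 20-inch pizza is hypothetical

area_ratio = pizza_area(large) / pizza_area(small)

print(f"10-inch area: {pizza_area(small):.1f} square inches")
print(f"20-inch area: {pizza_area(large):.1f} square inches")
print(f"area ratio when the diameter doubles: {area_ratio:.1f}")
print("volume ratio when a linear dimension doubles:", scale_factor(2, 3))
```

Doubling the diameter gives four times the pizza, and doubling a linear dimension of a solid of the same shape gives eight times the volume.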

Biologists have found that MANY characteristics of living things are described quite closely by power laws.  There are more mice than elephants, and more flies than mice: the abundance of species follows a power law with body weight.  So does the number of eggs a bird lays, and so on.  Sometimes the powers can be predicted from geometry, but sometimes they are mysterious.  Why, for example, does the rate at which animals use energy go up as the 3/4 power of their body weight??  This particular relationship is called "Kleiber's law" and works from bacteria to whales.  Therefore, power laws are a good place to start in simplifying relationships for living things.

Exponential growth models become linear when we apply the logarithm transformation to just the response variable y.  Power law models become linear when we apply the logarithm transformation to BOTH variables.

Power law model is  y = a x^p

Take the log of both sides to get log (y) = log (a) + p log (x)

Taking the log of  BOTH variables straightens the scatter plot of y against x. 
The power p becomes the slope of the line that links log y to log x.

Follow the steps above to verify that the regression is a good fit using r and r^2.
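As a check on the algebra, here is a sketch that regresses log (y) on log (x) for data that follow a power law exactly: pizza area against diameter, where A = (π/4) d^2, so a = π/4 and p = 2.  The list of diameters is made up, and least_squares is a hand-written helper.

```python
import math

def least_squares(xs, ys):
    """Return (intercept, slope, r) for the least-squares line of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope, sxy / math.sqrt(sxx * syy)

diameters = [8, 10, 12, 14, 16, 18]                   # inches, hypothetical
areas = [math.pi * (d / 2) ** 2 for d in diameters]   # exact power law

# Take logs of BOTH variables, then fit a line.
log_x = [math.log10(d) for d in diameters]
log_y = [math.log10(a) for a in areas]
intercept, slope, r = least_squares(log_x, log_y)

print(f"slope (estimate of p): {slope:.4f}")
print(f"10^intercept (estimate of a): {10 ** intercept:.4f}")
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
```

The slope recovers the power 2 and 10^intercept recovers π/4, about 0.7854.  Real data will not fall exactly on the line, so r and r^2 will be below 1.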

Great, specific directions for accomplishing this appear on page 219 in the technology toolbox.

IF taking logs of both variables makes a scatter plot linear, a power law is a reasonable model for the original data.  We can even estimate what power the law involves by regressing log y on log x and using the slope of the regression line as a good estimate of the power.  Remember the slope is only an estimate of the p in an underlying power model.  The greater the scatter of the points in the scatter plot about the fitted line, the smaller the confidence that this estimate is accurate.  See Ex. 4.8 page 215 for an example.
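To see that the slope is only an estimate, one can simulate a power law with a known p and add scatter.  Everything below is simulated, not real measurements: the power 0.75 is borrowed from Kleiber's law purely as a target, and the scatter sizes, seed, and function names are arbitrary assumptions.

```python
import math
import random

def loglog_slope(xs, ys):
    """Slope of the least-squares line of log10(y) on log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    return sxy / sxx

def simulate(p, a, scatter, rng):
    """Generate y = a x^p with random scatter of up to +/- `scatter` in log10(y)."""
    xs = [10 ** (k / 2) for k in range(7)]   # x from 1 to 1000
    ys = [a * x ** p * 10 ** rng.uniform(-scatter, scatter) for x in xs]
    return xs, ys

rng = random.Random(0)   # fixed seed so the run is repeatable
true_p = 0.75

tight = loglog_slope(*simulate(true_p, 1.0, 0.01, rng))   # little scatter
loose = loglog_slope(*simulate(true_p, 1.0, 0.20, rng))   # much more scatter

print(f"true p = {true_p}")
print(f"estimate with small scatter: {tight:.4f}")
print(f"estimate with large scatter: {loose:.4f}")
```

With scatter of 0.01 in log (y), the estimate cannot miss the true power by more than about 0.009 for these x values.  With scatter of 0.20 it can miss by as much as about 0.17, which is the sense in which more scatter means less confidence in the estimated power.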