Q: How many statisticians does it take to change a light bulb?
A: One—plus or minus three.

 

Chapter 7
    Sec 7.2

Probability is the math language that describes the LONG-RUN regular behavior of random phenomena.

Read the first sentence again until you understand every word.

The mean $\bar{x}$ of a set of observations is their ordinary average.  The mean of a random variable X is also an average of the possible values of X, but with an essential CHANGE to take into account the fact that NOT all outcomes need be equally likely.  See Ex. 7.5, page 407.  The mean of X is the LONG-RUN AVERAGE you expect over a very large number of trials.  Just as probabilities are an idealized description of long-run proportions, the mean of a probability distribution describes the long-run average outcome.

The common symbol for the mean of a probability distribution is $\mu_X$.  Notice the subscript, which indicates that this is the mean of the random variable X and not simply the mean of a normal distribution.  The mean of a random variable X is often called the EXPECTED VALUE of X.  The mean of a discrete random variable is the average of the possible outcomes, but a weighted average in which each outcome is weighted by its probability.  Because the probabilities add to 1, we have total weight 1 to distribute among the outcomes.  The probability distribution of a discrete random variable is given in table form, as on page 408, with row 1 giving the variable's values and row 2 giving the corresponding probabilities.  To find the mean of X, multiply each possible value by its probability, then ADD.  Symbolically, it looks like

$$\mu_X = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k$$
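
The "multiply, then ADD" recipe is easy to check in a few lines of Python.  This is only a sketch: the helper name `expected_value` and the four-value distribution are made up for illustration and are not the table on page 408.

```python
import math

def expected_value(values, probs):
    """Weighted average: multiply each value by its probability, then add."""
    if len(values) != len(probs):
        raise ValueError("need exactly one probability per value")
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must add to 1")
    return sum(x * p for x, p in zip(values, probs))

# Hypothetical distribution, for illustration only.
values = [0, 1, 2, 3]          # row 1: possible values of X
probs = [0.1, 0.2, 0.3, 0.4]   # row 2: corresponding probabilities

mu_x = expected_value(values, probs)
print(mu_x)
```

For this made-up table the ordinary (unweighted) average of the four values is 1.5, but the weighted mean comes out to about 2 because the larger values are more likely.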

The mean is a measure of the center of a distribution.  The variance and the standard deviation are the measures of spread that accompany the choice of the mean to measure center.  To distinguish between the variance of a data set ($s^2$) and the variance of a random variable, we need to change our notation to $\sigma_X^2$.  The definition of the variance of a random variable is similar to the definition of the sample variance from Chapter 1.  That is, the variance is an average, weighted by probability, of the squared deviation $(X - \mu_X)^2$ of the variable X from its mean.  See page 410 for more detail.
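
Written out the same way as the mean, the variance of a discrete random variable is $\sigma_X^2 = (x_1 - \mu_X)^2 p_1 + \cdots + (x_k - \mu_X)^2 p_k$, and the standard deviation $\sigma_X$ is its square root.  A minimal Python sketch of that calculation follows.  The helper name `mean_and_variance` and the distribution are assumptions made up for illustration, not taken from the text.

```python
import math

def mean_and_variance(values, probs):
    """Return (mu, sigma squared) for a discrete random variable."""
    mu = sum(x * p for x, p in zip(values, probs))
    # Variance: probability-weighted average of squared deviations from mu.
    var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, var

# Hypothetical distribution, for illustration only.
values = [0, 1, 2, 3]
probs = [0.1, 0.2, 0.3, 0.4]

mu_x, var_x = mean_and_variance(values, probs)
sd_x = math.sqrt(var_x)  # standard deviation = square root of the variance
print(mu_x, var_x, sd_x)
```

Notice that the weights are the probabilities themselves, so there is no division by n - 1 as there is for the sample variance.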

The "LAW OF LARGE NUMBERS" (holds true for any population)

Draw independent observations at random from any population with finite mean $\mu$.  Decide how accurately you would like to estimate $\mu$.  As the number of observations drawn increases, the mean $\bar{x}$ of the observed values eventually approaches the mean $\mu$ of the population as closely as you specified and then stays that close. (asymptotic - remember?)  The law says broadly that the average of many independent observations is stable and predictable, and that averaging over many individuals produces a stable result.
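
You can watch the law at work with a small simulation.  The sketch below assumes a hypothetical population, rolls of a fair six-sided die (mean 3.5), and uses an arbitrary fixed seed so the run repeats.  The function name `running_means` is made up for this example.

```python
import random

def running_means(draw, n, checkpoints):
    """Draw n observations and record the sample mean at each checkpoint."""
    total = 0.0
    results = {}
    for i in range(1, n + 1):
        total += draw()
        if i in checkpoints:
            results[i] = total / i
    return results

rng = random.Random(42)  # arbitrary fixed seed so the run is repeatable

def roll():
    """One observation from the hypothetical population: a fair die."""
    return rng.randint(1, 6)

means = running_means(roll, 100_000, {10, 100, 1_000, 10_000, 100_000})
for n_obs, xbar in sorted(means.items()):
    print(n_obs, round(xbar, 4))
```

The early sample means can wander well away from 3.5.  The later ones should settle close to it, which is the "approaches and then stays that close" behavior described above.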

The mean of a random variable is the average of the variable in two senses:
1)  by definition it is the average of the possible values, weighted by their probabilities
2)  by the law of large numbers it is the long run average of many independent observations on the variable.

We often cannot distinguish random behavior from systematic influences just by looking at data, which points out the need for statistical inference to supplement exploratory analysis of data.  Probability calculations can help verify that what we see in the data is more than a random pattern.  How large is a "large number"?  That depends on the variability of the random outcomes.  The more variable the outcomes, the more trials are needed to ensure that the mean outcome is close to the distribution mean.
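
To see why more variable outcomes need more trials, compare how much sample means bounce around for two populations with the same mean but different spread.  This is a rough simulation sketch: the two outcome lists, the sample sizes, the seed, and the helper name `spread_of_sample_means` are all assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(7)  # arbitrary fixed seed so the run is repeatable

def spread_of_sample_means(outcomes, n, repeats=200):
    """Standard deviation of `repeats` sample means, each from n random draws."""
    means = [
        statistics.fmean([rng.choice(outcomes) for _ in range(n)])
        for _ in range(repeats)
    ]
    return statistics.stdev(means)

# Two hypothetical populations, both with mean 0 and equally likely outcomes.
low_variability = [-1, 1]
high_variability = [-10, 10]

low_25 = spread_of_sample_means(low_variability, 25)
high_25 = spread_of_sample_means(high_variability, 25)
high_2500 = spread_of_sample_means(high_variability, 2500)

print("low variability,  n = 25:  ", round(low_25, 3))
print("high variability, n = 25:  ", round(high_25, 3))
print("high variability, n = 2500:", round(high_2500, 3))
```

With the same number of draws, the sample means from the more variable population scatter far more widely around 0.  It takes many more draws per sample before they cluster as tightly as those from the less variable population.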
 

RULES FOR VARIANCES:

The mean of a sum of random variables is the sum of their means, BUT this addition is not always true for variances.  If random variables are independent, association between their values is ruled out and their variances DO ADD.  Two random variables X and Y are independent if knowing that any event involving X alone did or did not occur tells us nothing about the occurrence of any event involving Y alone.  Probability models often assume independence when the random variables describe outcomes that appear unrelated to each other.  You should ask in each instance whether the assumption of independence seems reasonable.
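
Here is a small exact check (no simulation) that variances add for independent variables and can fail to add otherwise.  It is a sketch only: the two distributions are hypothetical, and the helper names `mean`, `variance`, and `sum_of_independent` are made up for this example.

```python
from itertools import product

def mean(dist):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def sum_of_independent(dist_x, dist_y):
    """Distribution of X + Y, ASSUMING X and Y are independent.

    Independence is what lets us multiply: P(X = x and Y = y) = P(x) * P(y).
    """
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        out[x + y] = out.get(x + y, 0.0) + px * py
    return out

# Hypothetical distributions, for illustration only.
dist_x = {0: 0.5, 1: 0.5}
dist_y = {1: 0.2, 2: 0.5, 3: 0.3}

var_x, var_y = variance(dist_x), variance(dist_y)
var_sum_indep = variance(sum_of_independent(dist_x, dist_y))

# Completely dependent case: the second variable is X itself, so the sum is 2X.
dist_2x = {2 * x: p for x, p in dist_x.items()}
var_sum_dep = variance(dist_2x)

print(var_x + var_y, var_sum_indep)  # independent: these agree
print(var_x + var_x, var_sum_dep)    # dependent: these do not
print(mean(dist_x) + mean(dist_x), mean(dist_2x))  # means add either way
```

In the dependent case the variance of the sum is four times the variance of X rather than twice, while the means still add.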

The exact rules for variance can be found on pages 420 and 421.  See Combining normal random variables on page 424.
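
As a reminder, the standard rules for means and variances are written out below in their usual textbook form, for constants a and b.  This is not a quotation of those pages, so check the book for its exact wording and conditions.

```latex
\begin{align*}
\mu_{a+bX} &= a + b\,\mu_X \\
\mu_{X+Y} &= \mu_X + \mu_Y \\
\sigma^2_{a+bX} &= b^2\,\sigma^2_X \\
\sigma^2_{X+Y} &= \sigma^2_X + \sigma^2_Y \quad \text{(X and Y independent)} \\
\sigma^2_{X-Y} &= \sigma^2_X + \sigma^2_Y \quad \text{(X and Y independent)}
\end{align*}
```

Note that the variance of a DIFFERENCE of independent variables is still a SUM of variances, and that standard deviations themselves do not add.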
