If you want three opinions, just ask two statisticians.

Chapter 7

Sec 7.1

Sample spaces need not consist of numbers. In statistics, however, we are
most often interested in numerical outcomes such as the "count" of an
occurrence. We denote such an outcome by a capital letter near the end of the
alphabet, like X or Y. We call X a random variable because its values vary
when the phenomenon is repeated.

A **random variable**
is a variable whose value is a numerical outcome of a random phenomenon.
When a random variable describes a random phenomenon, the sample space S just
lists the possible values of the random variable. There are two ways of
assigning probabilities to the values of a random variable that will dominate
our application of probability as we study statistical inference.

Random variables can be either discrete or continuous. A discrete random
variable X has a countable number of possible values. The probability
distribution of X lists the values and their probabilities in table form.
The probabilities must satisfy two requirements:

1) every probability p_{i} is a number between 0 and 1

2) p_{1} + p_{2} + ... + p_{k} = 1.

The probability of any event is found by adding the probabilities p_{i}
of the particular values x_{i} that make up the event.
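As a minimal sketch of these rules, the Python below checks the two requirements and adds up the probabilities for an event. The distribution (number of heads in two fair coin tosses) and the helper names `is_valid_distribution` and `event_probability` are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

# Assumed example: X = number of heads in two fair coin tosses.
# Keys are the values x_i, and entries are the probabilities p_i.
distribution = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def is_valid_distribution(dist):
    """Check both requirements: each p_i is in [0, 1] and the p_i sum to 1."""
    each_in_range = all(0 <= p <= 1 for p in dist.values())
    return each_in_range and sum(dist.values()) == 1

def event_probability(dist, event):
    """Add the probabilities p_i of the values x_i that make up the event."""
    return sum(p for x, p in dist.items() if event(x))

valid = is_valid_distribution(distribution)
# Event "at least one head" is made up of the values 1 and 2.
p_at_least_one_head = event_probability(distribution, lambda x: x >= 1)
print(valid, p_at_least_one_head)  # True 3/4
```

Exact fractions are used here so that the "sum to 1" check is not disturbed by floating-point rounding.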

In Chapters 1 and 2 we used histograms and density curves to describe finite
quantitative data. In this chapter we will use analogous methods to
describe the probabilities of random variables, starting with discrete random
variables that have finitely many values. We previously used histograms to
picture the distributions of *data*. For discrete random variables, a
histogram can display the *probability* distribution as an alternative to a
table. The height of each bar shows the
probability of the outcome at its base. Because the heights are
probabilities, they add to 1. All the bars in the histogram have the same
width, so the areas of the bars also display the assignment of probability to
outcomes. See Ex. 7.2, page 394, for more explanation.
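A rough text-only sketch of a probability histogram follows. The distribution (number of heads in four fair coin tosses) and the function name `histogram_rows` are assumptions chosen for illustration. The bars are drawn sideways, so bar length plays the role of bar height.

```python
# Assumed example: X = number of heads in four fair coin tosses.
distribution = {0: 1/16, 1: 4/16, 2: 6/16, 3: 4/16, 4: 1/16}

def histogram_rows(dist, scale=32):
    """Return one text bar per value; bar length is proportional to probability."""
    return [
        f"{x} | {'#' * round(p * scale)} {p:.4f}"
        for x, p in sorted(dist.items())
    ]

rows = histogram_rows(distribution)
print("\n".join(rows))

# The bar "heights" are probabilities, so they add to 1.
total_height = sum(distribution.values())
```

Because every bar occupies one row of equal width, comparing bar lengths is the same as comparing areas.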

For continuous random variables, which take infinitely many values within a
given interval, other methods must be employed. We cannot assign a probability
to EACH individual value of x and then sum, since there are INFINITELY many
possible values. Instead, we assign probabilities directly to events using areas
under a density curve. Any density curve has area exactly 1 underneath it,
corresponding to total probability 1.

More formally...

A continuous random variable X takes all values in an interval of numbers. The probability distribution of X is described by a density curve. The probability of any event is the area under the density curve and above the values of X that make up the event.
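To make "probability is area under the density curve" concrete, here is a small numerical sketch. The density f(x) = 2x on the interval [0, 1] is an assumed example (it is not from the text), and `area_under` is a hypothetical helper that approximates the area with a midpoint sum.

```python
def density(x):
    """Assumed density curve: f(x) = 2x on [0, 1], and 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def area_under(f, a, b, steps=10_000):
    """Approximate the area under f between a and b using the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Total area under the density curve: should be very close to 1.
total_area = area_under(density, 0, 1)

# P(0.5 <= X <= 1) is the area above the interval [0.5, 1]: close to 0.75.
p_event = area_under(density, 0.5, 1)

print(round(total_area, 6), round(p_event, 6))
```

Any other non-negative function with total area 1 could be substituted for `density` without changing the rest of the sketch.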

__The probability model for a continuous random variable
assigns probabilities to intervals of outcomes rather than to individual
outcomes. In fact, all continuous probability distributions assign
probability 0 to every individual outcome. Only intervals of values have
positive probability.__ For continuous random variables we therefore ignore the distinction between > and ≥.
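One way to see why an individual outcome gets probability 0 is to shrink an interval around it and watch the area vanish. The sketch below assumes, purely for illustration, that X is uniform on [0, 1] (a flat density of height 1); the function name `uniform_interval_probability` is hypothetical.

```python
def uniform_interval_probability(a, b):
    """P(a <= X <= b) for X uniform on [0, 1]: a rectangle of height 1."""
    lo, hi = max(a, 0.0), min(b, 1.0)
    return max(hi - lo, 0.0)

# Shrink an interval around the single outcome 0.5.
half_widths = [0.1, 0.01, 0.001, 0.0]
probabilities = [
    uniform_interval_probability(0.5 - h, 0.5 + h) for h in half_widths
]

for h, p in zip(half_widths, probabilities):
    print(f"half-width {h}: probability {p:.4f}")
```

The last entry is an interval of width zero, that is, the single outcome 0.5, and its probability is exactly 0.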

Because any density curve describes an assignment of probabilities, normal distributions are probability distributions. Recall the notation N(mean, standard deviation) for data, which permitted standardization of data to "z scores". A normal random variable can be standardized with the same formula: if X has the N(mean, standard deviation) distribution, then Z = (X - mean) / (standard deviation) is a standard normal random variable with distribution N(0,1).
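The standardization can be sketched with `statistics.NormalDist` from the Python standard library. The mean 64.5, standard deviation 2.5, and the value 68 are made-up numbers for illustration, and `standardize` is a hypothetical helper name.

```python
from statistics import NormalDist

# Assumed example: X has the N(64.5, 2.5) distribution.
mean, sd = 64.5, 2.5
X = NormalDist(mean, sd)
Z = NormalDist(0, 1)  # standard normal distribution N(0, 1)

def standardize(x, mean, sd):
    """Convert a value of X to a z score: z = (x - mean) / sd."""
    return (x - mean) / sd

x = 68.0
z = standardize(x, mean, sd)  # (68 - 64.5) / 2.5 = 1.4

# P(X <= 68) computed directly, and P(Z <= 1.4) after standardizing.
p_from_x = X.cdf(x)
p_from_z = Z.cdf(z)
print(z, round(p_from_x, 4), round(p_from_z, 4))
```

The two probabilities agree, which is the reason a single table of standard normal areas is enough for any normal random variable.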