Inference For the Mean of a Population:

 

If our data come from a simple random sample (SRS) and the sample size is sufficiently large, then we know that the sampling distribution of the sample mean $\bar{x}$ is approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

 

 

PROBLEM:

If $\sigma$ is unknown, then we cannot calculate the standard deviation for the sampling model.

 

 

We must estimate the value of $\sigma$ in order to use the methods of inference that we have learned.

 

 

SOLUTION:

We will use $s$ (the standard deviation of the sample) to estimate $\sigma$.

 

Then the standard error of the sample mean is $s/\sqrt{n}$.
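
As a small illustration of this estimate, the sketch below computes $s$ and the standard error $s/\sqrt{n}$ in Python. The data values are hypothetical and chosen only for the example; `statistics.stdev` divides by $n - 1$, which matches the sample standard deviation used here.

```python
import math
import statistics

# Hypothetical sample of n = 10 measurements (illustrative values only).
sample = [4.8, 5.1, 5.4, 4.9, 5.0, 5.3, 5.2, 4.7, 5.5, 5.1]

n = len(sample)
s = statistics.stdev(sample)        # sample standard deviation (divides by n - 1)
standard_error = s / math.sqrt(n)   # estimated standard deviation of the sample mean

print(f"n = {n}, s = {s:.4f}, standard error = {standard_error:.4f}")
```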

In order to standardize $\bar{x}$, we subtract its mean and divide by its standard deviation. The resulting statistic

$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

has the standard normal distribution $N(0, 1)$.

 

PROBLEM:

If we replace $\sigma$ with $s$, then the statistic has more variation and no longer has a normal distribution, so we cannot call it $z$.

 

It has a new distribution called the t distribution:

$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
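
The extra variation can be seen in a quick simulation. The sketch below is only an illustration under assumed settings: a hypothetical normal population with mean 0 and standard deviation 1, samples of size 5, and a fixed random seed. For each sample it standardizes the sample mean twice, once with the true $\sigma$ and once with the estimate $s$, and counts how often each result lands beyond $\pm 1.96$.

```python
import math
import random
import statistics

random.seed(1)                 # fixed seed so the run is repeatable
MU, SIGMA, N = 0.0, 1.0, 5     # hypothetical population and a small sample size
TRIALS = 20000

z_extreme = 0
t_extreme = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)
    z = (xbar - MU) / (SIGMA / math.sqrt(N))   # standardized with the true sigma
    t = (xbar - MU) / (s / math.sqrt(N))       # standardized with the estimate s
    if abs(z) > 1.96:
        z_extreme += 1
    if abs(t) > 1.96:
        t_extreme += 1

print(f"share of |z| > 1.96: {z_extreme / TRIALS:.3f}")
print(f"share of |t| > 1.96: {t_extreme / TRIALS:.3f}")
```

The second share should come out noticeably larger than the first, which is the heavier-tailed behavior described above.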

 

 

t is a standardized value. Like z, t tells us how many standardized units $\bar{x}$ is from the mean $\mu$.

 

When we describe a t distribution we must identify its degrees of freedom because there is a different t distribution for each sample size.

 

The degrees of freedom for the one-sample t statistic are $n - 1$.
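
A minimal sketch of the calculation, assuming a hypothetical sample and a hypothesized mean of 5.0 (both invented for illustration). The helper name `one_sample_t` is not from any library; it simply evaluates $t = (\bar{x} - \mu)/(s/\sqrt{n})$ and reports $n - 1$ degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for the one-sample t statistic (xbar - mu0) / (s / sqrt(n))."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)          # divides by n - 1
    t = (xbar - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical data and hypothesized mean, for illustration only.
sample = [4.8, 5.1, 5.4, 4.9, 5.0, 5.3, 5.2, 4.7, 5.5, 5.1]
t, df = one_sample_t(sample, mu0=5.0)
print(f"t = {t:.3f} with {df} degrees of freedom")
```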

 

 

The t distribution is symmetric about zero and is bell-shaped, but it has more variation than the standard normal distribution, so its spread is greater.

 

As the degrees of freedom increase, the t distribution gets closer to the normal distribution, since $s$ gets closer to $\sigma$.
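
One way to see this convergence numerically is to compare the area to the right of 2 under several t curves with the same area under the standard normal curve. The sketch below uses only the Python standard library: it writes out the t density directly and integrates it with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are hypothetical, and the choice of the cutoff 2 is arbitrary.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t_value, df, steps=2000):
    """Area to the right of t_value (t_value > 0).

    By symmetry this is 0.5 minus the area on [0, t_value],
    which is approximated with Simpson's rule (steps must be even).
    """
    h = t_value / steps
    total = t_pdf(0.0, df) + t_pdf(t_value, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

normal_tail = 1 - NormalDist().cdf(2.0)
for df in (2, 5, 30, 100, 1000):
    print(f"df = {df:4d}: P(T > 2) = {t_upper_tail(2.0, df):.4f}")
print(f"normal   : P(Z > 2) = {normal_tail:.4f}")
```

The tail areas shrink toward the normal value as the degrees of freedom grow, reflecting the narrower spread.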

 

We can construct a confidence interval using the t distribution in the same way we constructed confidence intervals using the z distribution:

$\bar{x} \pm t^* \dfrac{s}{\sqrt{n}}$

 

 

Remember, the t Table uses the area to the RIGHT of t*.
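
As a worked sketch, the code below builds a 95% confidence interval $\bar{x} \pm t^* s/\sqrt{n}$ for a hypothetical sample of 10 values. The critical value 2.262 is the standard table entry for 9 degrees of freedom with area 0.025 to the right; it is typed in by hand here rather than computed, and the data are invented for illustration.

```python
import math
import statistics

# Hypothetical sample, for illustration only.
sample = [4.8, 5.1, 5.4, 4.9, 5.0, 5.3, 5.2, 4.7, 5.5, 5.1]

n = len(sample)
xbar = statistics.mean(sample)
s = statistics.stdev(sample)

# For 95% confidence the area to the RIGHT of t* is (1 - 0.95) / 2 = 0.025.
# With df = n - 1 = 9, a standard t table gives t* = 2.262.
T_STAR = 2.262

margin = T_STAR * s / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin
print(f"95% confidence interval for the mean: ({lower:.3f}, {upper:.3f})")
```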

 

One-sample t procedures are exactly correct only when the population is normal. We assume that the population is approximately normal in order to justify the use of t procedures.

 

A confidence interval or significance test is called robust if the confidence level or P-value does not change very much when the assumptions are violated.

 

The t procedures are strongly influenced by outliers. Always check the data first!

 

If there are outliers and the sample size is small, the results will not be reliable.

 

The t procedures are robust when there are no outliers, especially when the distribution is approximately symmetric.

 

 

When to use t procedures:

         If the sample size is less than 15, only use t procedures if the data are close to normal.

         If the sample size is at least 15, only use t procedures if there are no outliers.

         If the sample size is at least 40, you may use t procedures, even if the data are skewed.
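
The guidelines above can be written as a small decision helper. This is only a sketch: the function name `t_procedures_ok` and its boolean inputs are hypothetical, and deciding whether data are "close to normal" or contain outliers still requires looking at a plot of the data.

```python
def t_procedures_ok(n, close_to_normal, has_outliers):
    """Apply the three sample-size guidelines from the notes.

    n               -- sample size
    close_to_normal -- True if a plot of the data looks roughly normal
    has_outliers    -- True if the data contain outliers
    """
    if n >= 40:
        return True                  # large sample: usable even if skewed
    if n >= 15:
        return not has_outliers      # moderate sample: usable unless outliers are present
    return close_to_normal           # small sample: data must be close to normal

print(t_procedures_ok(10, close_to_normal=False, has_outliers=False))
print(t_procedures_ok(20, close_to_normal=False, has_outliers=False))
print(t_procedures_ok(50, close_to_normal=False, has_outliers=False))
```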

 

 

 

 

 

 

 

 

Comparing Two Means:

 

Comparative studies are more convincing than single-sample investigations, so one-sample inference is not as common as comparative (two-sample) inference.

 

In a comparative study, we may want to compare two treatments, or we may want to compare two populations. In either case, the samples must be chosen randomly and independently in order to perform statistical inference.

 

Because matched pairs are NOT chosen independently, we will NOT use two-sample inference for a matched pairs design. For a matched pairs design, apply the one-sample t procedures to the observed differences.
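
A minimal sketch of the matched pairs approach, using invented before/after measurements on the same eight subjects. The differences are treated as a single sample, and the one-sample t statistic tests whether their mean is zero.

```python
import math
import statistics

# Hypothetical paired measurements on the same 8 subjects (illustration only).
before = [72, 68, 75, 80, 66, 71, 77, 69]
after = [70, 65, 76, 74, 63, 70, 72, 66]

# Reduce each pair to one number, then apply one-sample t procedures.
differences = [b - a for b, a in zip(before, after)]
n = len(differences)
dbar = statistics.mean(differences)
s_d = statistics.stdev(differences)

t = (dbar - 0) / (s_d / math.sqrt(n))   # null hypothesis: mean difference is 0
df = n - 1
print(f"mean difference = {dbar}, t = {t:.3f}, df = {df}")
```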

 

Otherwise, we may use two-sample inference to compare two treatments or two populations.

 

The null hypothesis is that there is no difference between the two parameters:

$H_0: \mu_1 = \mu_2$

or

$H_0: \mu_1 - \mu_2 = 0$

 

 

 

The alternative hypothesis could be that

$H_a: \mu_1 \neq \mu_2$ (two-sided)

or $H_a: \mu_1 > \mu_2$ (one-sided)

or $H_a: \mu_1 < \mu_2$ (one-sided)

 

Before you begin, check your assumptions! For comparing two means, each sample must be an SRS and the two samples must be chosen independently. Also, both populations must be normally distributed. (Check the data for outliers or skewness.)

 

If these assumptions hold, then the difference in sample means is an unbiased estimator of the difference in population means, so the mean of $\bar{x}_1 - \bar{x}_2$ is equal to $\mu_1 - \mu_2$.

 

Also, the variance of $\bar{x}_1 - \bar{x}_2$ is the sum of the variances of $\bar{x}_1$ and $\bar{x}_2$, which is $\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$.

 

Furthermore, if both populations are normally distributed, then $\bar{x}_1 - \bar{x}_2$ is also normally distributed.
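
These facts about the mean and variance of $\bar{x}_1 - \bar{x}_2$ can be checked by simulation. The sketch below assumes two hypothetical normal populations (means 10 and 8, standard deviations 2 and 3) and sample sizes 12 and 20, all chosen arbitrarily, and compares the simulated mean and variance of the difference in sample means with $\mu_1 - \mu_2$ and $\sigma_1^2/n_1 + \sigma_2^2/n_2$.

```python
import random
import statistics

random.seed(2)                        # fixed seed so the run is repeatable
MU1, SIGMA1, N1 = 10.0, 2.0, 12       # hypothetical population 1 and sample size
MU2, SIGMA2, N2 = 8.0, 3.0, 20        # hypothetical population 2 and sample size
TRIALS = 20000

diffs = []
for _ in range(TRIALS):
    xbar1 = statistics.fmean([random.gauss(MU1, SIGMA1) for _ in range(N1)])
    xbar2 = statistics.fmean([random.gauss(MU2, SIGMA2) for _ in range(N2)])
    diffs.append(xbar1 - xbar2)       # independent samples, so variances add

theory_mean = MU1 - MU2
theory_var = SIGMA1 ** 2 / N1 + SIGMA2 ** 2 / N2

print(f"simulated mean = {statistics.fmean(diffs):.3f}, theory = {theory_mean:.3f}")
print(f"simulated variance = {statistics.variance(diffs):.3f}, theory = {theory_var:.3f}")
```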

 

 

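In order to standardize $\bar{x}_1 - \bar{x}_2$, subtract the mean and divide by the standard deviation:

$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$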

 

 

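If we do not know $\sigma_1$ and $\sigma_2$, we will estimate them with the sample standard deviations $s_1$ and $s_2$ and substitute the resulting standard error for the standard deviation. This gives the standardized t value:

$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$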

 

 

CAUTION! This statistic does NOT have a t distribution. It is only approximately t, with degrees of freedom that must be estimated from the data.
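
Although the two-sample statistic is not exactly t, it is commonly approximated by a t distribution. The sketch below computes the statistic for the null hypothesis $\mu_1 - \mu_2 = 0$ along with two common choices of degrees of freedom: the Welch-Satterthwaite approximation and the conservative choice, the smaller of $n_1 - 1$ and $n_2 - 1$. The data and the function name `two_sample_t` are hypothetical.

```python
import math
import statistics

def two_sample_t(sample1, sample2):
    """Return (t, df_welch, df_conservative) for H0: mu1 - mu2 = 0."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1    # s1^2 / n1
    v2 = statistics.variance(sample2) / n2    # s2^2 / n2
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Conservative alternative: the smaller of n1 - 1 and n2 - 1.
    df_conservative = min(n1, n2) - 1
    return t, df_welch, df_conservative

# Hypothetical independent samples, for illustration only.
group1 = [12, 15, 11, 14, 13, 16]
group2 = [10, 9, 12, 11, 8]

t, df_welch, df_conservative = two_sample_t(group1, group2)
print(f"t = {t:.3f}, Welch df = {df_welch:.2f}, conservative df = {df_conservative}")
```

The Welch value is usually not a whole number, which is one reason this calculation is normally left to a calculator or software.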

 

We will use the TI-83 to perform two-sample t procedures:

 

2-SampTTest for a hypothesis test

2-SampTInt for a confidence interval