**Estimating with Confidence:**

Suppose I want to know how often teenagers go to the movies. Specifically, I want to know how many times per month a typical teenager (ages 13 through 17) goes to the movies.

Suppose I take an SRS of 100 teenagers and calculate the sample mean to be $\bar{x} = 2.1$.

The sample mean $\bar{x}$ is an unbiased estimator of the unknown population mean $\mu$, so I would estimate the population mean to be approximately 2.1. However, a different sample would have given a different sample mean, so I must consider the amount of variation in the sampling model for $\bar{x}$.

▪ The sampling model for $\bar{x}$ is approximately normal.

▪ The mean of the sampling model is $\mu$.

▪ The standard deviation of the sampling model is $\frac{\sigma}{\sqrt{n}}$, assuming the population size is at least 10n.

Suppose we know that the population standard deviation is $\sigma = 0.5$. Then the standard deviation for the sampling model is $\frac{\sigma}{\sqrt{n}} = \frac{0.5}{\sqrt{100}} = 0.05$.

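Then 95% of our samples will produce a statistic $\bar{x}$ that is between $\mu - 0.10$ and $\mu + 0.10$, that is, within about two standard deviations of $\mu$.

Therefore in 95% of our samples, the interval between $\bar{x} - 0.10$ and $\bar{x} + 0.10$ will contain the parameter $\mu$.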
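The claim above can be checked with a small simulation. This is only a sketch under stated assumptions: the "true" mean `MU = 2.0` is a hypothetical value chosen for illustration (in practice it is unknown), the population is assumed to be normal with standard deviation 0.5, and the names `MU`, `SIGMA`, `MARGIN`, and `interval_captures_mu` are made up for this example.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

MU = 2.0      # hypothetical true population mean (unknown in practice)
SIGMA = 0.5   # assumed population standard deviation
N = 100       # sample size
MARGIN = 2 * SIGMA / N ** 0.5  # two standard deviations of x-bar, i.e. 0.10


def interval_captures_mu():
    """Draw one sample of size N and report whether x-bar +/- MARGIN contains MU."""
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = mean(sample)
    return x_bar - MARGIN <= MU <= x_bar + MARGIN


TRIALS = 2000
coverage = sum(interval_captures_mu() for _ in range(TRIALS)) / TRIALS
print(f"Fraction of intervals containing mu: {coverage:.3f}")
```

The printed fraction should land near 0.95. Each individual interval either contains $\mu$ or it does not; the 95% describes how often the method succeeds over many samples.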

The **margin of error** is 0.10.

For our sample of 100 teenagers, $\bar{x} = 2.1$. Because the margin of error is 0.10, we are 95% confident that the true population mean lies somewhere in the interval $2.1 \pm 0.10$, or [2.0, 2.2].

The interval [2.0, 2.2] is a **95% confidence interval** because we are 95% confident that the unknown $\mu$ lies between 2.0 and 2.2.

Start with sample data. Compute an interval using a method that has probability C of capturing the true value of the parameter. This is called a **level C confidence interval**.

How do we construct confidence intervals?

Since the sampling model of the sample mean is approximately normal, we can use normal calculations to construct confidence intervals.

For a 95% confidence interval, we want the interval corresponding to the middle 95% of the normal curve.

For a 90% confidence interval, we want the interval corresponding to the middle 90% of the normal curve.

And so on…

If we are using the standard normal curve, we want to find the interval using **z-values**.

**********************************

Suppose we want to find a 90% confidence interval for a standard normal curve. If the middle 90% lies within our interval, then the remaining 10% lies outside our interval. Because the curve is symmetric, there is 5% below the interval and 5% above the interval. Find the **z-values** with area 5% below and 5% above.

These **z-values** are denoted $\pm z^*$. Because they come from the standard normal curve, they are centered at mean 0.

$z^*$ is called the **upper p critical value**, with probability p lying to its right under the standard normal curve.

To find p, we find the complement of C and divide it in half, or find $p = \frac{1 - C}{2}$.

For a 95% confidence interval, we want the **z-values** with upper p critical value for p = 2.5%.

For a 99% confidence interval, we want the **z-values** with upper p critical value for p = 0.5%.
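As a rough sketch of how these critical values can be looked up without a table, the snippet below uses `NormalDist` from Python's standard `statistics` module. The helper name `critical_value` is made up for this illustration.

```python
from statistics import NormalDist


def critical_value(confidence):
    """Return z* for a level-C interval: the z-value with area p = (1 - C)/2 to its right."""
    p = (1 - confidence) / 2
    # inv_cdf takes the area to the LEFT of the value, so we pass 1 - p.
    return NormalDist().inv_cdf(1 - p)


for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.2f}   p = {(1 - c) / 2:.3f}   z* = {critical_value(c):.3f}")
```

For C = 0.90, 0.95, and 0.99 this gives critical values of about 1.645, 1.960, and 2.576.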

Remember that **z-values** tell us how many standard deviations we are above or below the mean.

To construct a 95% confidence interval, we want to find the values 1.96 standard deviations below the mean and 1.96 standard deviations above the mean, or mean $\pm\ 1.96$ standard deviations.

Using our sample data, this is $\bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}$, assuming the population is at least 10n.

In general, to construct a level C confidence interval using our sample data, we want to find $\bar{x} \pm z^*\,\frac{\sigma}{\sqrt{n}}$.

The estimate for $\mu$ is $\bar{x}$.

The margin of error is $z^*\,\frac{\sigma}{\sqrt{n}}$. Note that the margin of error is a positive **number**.
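The general recipe, estimate plus or minus margin of error, can be sketched in a few lines. The function name `confidence_interval` is hypothetical, and the value $\sigma = 0.5$ used in the example call is an assumption chosen to be consistent with the 0.10 margin of error in the movie example.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(x_bar, sigma, n, confidence):
    """Level-C z-interval for a population mean when sigma is known.

    Returns (lower, upper, margin_of_error).
    """
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# Movie example: x-bar = 2.1, n = 100, assumed sigma = 0.5, 95% confidence.
low, high, margin = confidence_interval(x_bar=2.1, sigma=0.5, n=100, confidence=0.95)
print(f"margin of error = {margin:.3f}, interval = [{low:.3f}, {high:.3f}]")
```

With $z^* = 1.96$ the margin of error comes out to about 0.098, slightly smaller than the 0.10 obtained earlier by rounding to two standard deviations.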

We would like **high** confidence and a **small** margin of error.

A higher confidence level means a higher percentage of all samples produce a statistic close to the true value of the parameter. Therefore we want a high level of confidence.

A smaller margin of error allows us to get closer to the true value of the parameter, so we want a small margin of error.

*****************************

**So how do we reduce the margin of error?**

▪ Lower the confidence level (by decreasing the value of $z^*$)

▪ Lower the standard deviation $\sigma$

▪ Increase the sample size. To cut the margin of error in half, increase the sample size to four times the previous size.
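To see each of the three options at work, here is a small sketch that recomputes the margin of error after changing one quantity at a time. The baseline values ($\sigma = 0.5$, $n = 100$, 95% confidence) are assumptions for illustration, and `margin_of_error` is a made-up helper name.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence):
    """Margin of error z* * sigma / sqrt(n) for a level-C z-interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sigma / sqrt(n)


base = margin_of_error(sigma=0.5, n=100, confidence=0.95)
lower_conf = margin_of_error(sigma=0.5, n=100, confidence=0.90)      # smaller z*
smaller_sigma = margin_of_error(sigma=0.25, n=100, confidence=0.95)  # half the sigma
bigger_n = margin_of_error(sigma=0.5, n=400, confidence=0.95)        # four times the n

print(f"baseline:            {base:.4f}")
print(f"90% confidence:      {lower_conf:.4f}")
print(f"sigma cut in half:   {smaller_sigma:.4f}")
print(f"n multiplied by 4:   {bigger_n:.4f}")
```

Because $n$ sits under a square root, quadrupling the sample size only halves the margin of error, the same effect as halving $\sigma$.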

You can have **high** confidence and a **small** margin of error if you choose the right sample size.

To determine the sample size *n* that will yield a confidence interval for a population mean with a specified margin of error *m*, set the expression for the margin of error to be less than or equal to *m* and solve for *n*: $z^*\,\frac{\sigma}{\sqrt{n}} \le m$. Round the result up to the next whole number.
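Solving the inequality for *n* gives $n \ge \left(\frac{z^* \sigma}{m}\right)^2$. A minimal sketch of that calculation follows; the helper name `required_sample_size` and the example inputs ($\sigma = 0.5$, target margin 0.05, 95% confidence) are assumptions for illustration only.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence):
    """Smallest whole n with z* * sigma / sqrt(n) <= margin."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round UP: rounding down would leave the margin slightly too large.
    return ceil((z_star * sigma / margin) ** 2)


n = required_sample_size(sigma=0.5, margin=0.05, confidence=0.95)
z_star = NormalDist().inv_cdf(0.975)
achieved = z_star * 0.5 / sqrt(n)  # margin of error actually achieved with this n
print(f"n = {n}, achieved margin of error = {achieved:.4f}")
```

Note that halving the target margin from about 0.10 to 0.05 pushes the required sample size from roughly 100 to nearly 400, in line with the "four times" rule.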

**CAUTION!!**

These methods only apply to certain situations. In order to construct a level C confidence interval using the formula $\bar{x} \pm z^*\,\frac{\sigma}{\sqrt{n}}$:

1) the data must be an SRS

2) we must know the population standard deviation $\sigma$

3) we want to eliminate (if possible) any outliers.

The margin of error only covers random sampling errors.

Things like under-coverage, non-response, and poor sampling designs can cause additional errors.