Estimating with Confidence:
Suppose I want to know how often teenagers go to the movies. Specifically, I want to know how many times per month a typical teenager (ages 13 through 17) goes to the movies.
Suppose I take an SRS of 100 teenagers and calculate the sample mean to be $\bar{x} = 2.1$.
The sample mean $\bar{x}$ is an unbiased estimator of the unknown population mean $\mu$, so I would estimate the population mean to be approximately 2.1. However, a different sample would have given a different sample mean, so I must consider the amount of variation in the sampling model for $\bar{x}$.
▪ The sampling model for $\bar{x}$ is approximately normal.
▪ The mean of the sampling model is $\mu$.
▪ The standard deviation of the sampling model is $\sigma/\sqrt{n}$, assuming the population size is at least 10n.
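Suppose we know that the population standard deviation is $\sigma = 0.5$. Then the standard deviation for the sampling model is $\sigma/\sqrt{n} = 0.5/\sqrt{100} = 0.05$.
By the 68-95-99.7 rule, 95% of our samples will produce a statistic $\bar{x}$ that is within two standard deviations of $\mu$, that is, between $\mu - 0.10$ and $\mu + 0.10$.
Therefore in 95% of our samples, the interval between $\bar{x} - 0.10$ and $\bar{x} + 0.10$ will contain the parameter $\mu$.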
The margin of error is 0.10.
For our sample of 100 teenagers, $\bar{x} = 2.1$. Because the margin of error is 0.10, we are 95% confident that the true population mean lies somewhere in the interval $2.1 \pm 0.10$, or [2.0, 2.2].
The interval [2.0, 2.2] is a 95% confidence interval because we are 95% confident that the unknown $\mu$ lies between 2.0 and 2.2.
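The arithmetic of this example can be checked with a short Python sketch. The value sigma = 0.5 is an assumption: it is the population standard deviation implied by a margin of error of 0.10 with n = 100 under the two-standard-deviation rule. The function name is hypothetical.

```python
import math

def two_sd_interval(x_bar, sigma, n):
    """Approximate 95% interval: x_bar plus or minus two standard deviations of x_bar."""
    se = sigma / math.sqrt(n)   # standard deviation of the sampling model
    margin = 2 * se             # two-standard-deviation rule for 95%
    return x_bar - margin, x_bar + margin

# x_bar = 2.1 and n = 100 come from the example; sigma = 0.5 is assumed.
low, high = two_sd_interval(2.1, 0.5, 100)
print(round(low, 2), round(high, 2))  # prints 2.0 2.2
```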
Start with sample data. Compute an interval using a method that has probability C of producing an interval that contains the true value of the parameter. This is called a level C confidence interval.
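One way to see what "probability C" means is to simulate many samples and count how often the interval captures the parameter. The sketch below is only an illustration under stated assumptions: the population is taken to be normal with a hypothetical mean of 2.0 and standard deviation 0.5, and the interval is the sample mean plus or minus two standard deviations of the sampling model.

```python
import math
import random

def coverage(mu, sigma, n, trials, seed=0):
    """Fraction of simulated samples whose interval contains the true mean mu."""
    rng = random.Random(seed)               # fixed seed so the run is repeatable
    margin = 2 * sigma / math.sqrt(n)       # two standard deviations of x_bar
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

# mu = 2.0 is a made-up "true" value used only for the simulation.
print(coverage(mu=2.0, sigma=0.5, n=100, trials=2000))
```

The printed fraction should land close to 0.95. Any single interval either contains the parameter or it does not; the 95% describes how often the method succeeds over many samples.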
How do we construct confidence intervals?
Since the sampling model of the sample mean is approximately normal, we can use normal calculations to construct confidence intervals.
For a 95% confidence interval, we want the interval corresponding to the middle 95% of the normal curve.
For a 90% confidence interval, we want the interval corresponding to the middle 90% of the normal curve.
And so on...
If we are using the standard normal curve, we want to find the interval using z-values.
Suppose we want to find a 90% confidence interval for a standard normal curve. If the middle 90% lies within our interval, then the remaining 10% lies outside our interval. Because the curve is symmetric, there is 5% below the interval and 5% above the interval. Find the z-values with area 5% below and 5% above.
These z-values are denoted $\pm z^*$. Because they come from the standard normal curve, they are centered at mean 0.
$z^*$ is called the upper p critical value, with probability p lying to its right under the standard normal curve.
To find p, we find the complement of C and divide it in half, or find $p = (1 - C)/2$.
For a 95% confidence interval, we want the critical value $z^*$ with upper tail probability p = 0.025, which is $z^* = 1.960$.
For a 99% confidence interval, we want the critical value $z^*$ with upper tail probability p = 0.005, which is $z^* = 2.576$.
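Critical values can be looked up in a table, or computed. The following sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `critical_value` is hypothetical. It takes p as half of the complement of C and finds the z-value with area p to its right.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Return z* such that the middle `confidence` of the standard normal curve lies in [-z*, z*]."""
    p = (1 - confidence) / 2              # upper tail probability
    return NormalDist().inv_cdf(1 - p)    # z-value with area p to its right

for c in (0.90, 0.95, 0.99):
    print(c, round(critical_value(c), 3))
# prints:
# 0.9 1.645
# 0.95 1.96
# 0.99 2.576
```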
Remember that z-values tell us how many standard deviations we are above or below the mean.
To construct a 95% confidence interval, we want to find the values 1.96 standard deviations below the mean and 1.96 standard deviations above the mean, or $\mu \pm 1.96\sigma$.
Using our sample data and the sampling model of $\bar{x}$, this is $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$, assuming the population size is at least 10n.
In general, to construct a level C confidence interval using our sample data, we want to find $\bar{x} \pm z^*\,\sigma/\sqrt{n}$.
The estimate for $\mu$ is $\bar{x}$.
The margin of error is $z^*\,\sigma/\sqrt{n}$. Note that the margin of error is a positive number. It is not an interval.
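The general recipe (estimate plus or minus critical value times the standard deviation of the sampling model) can be sketched as a small function. The function name and return layout are hypothetical, and sigma = 0.5 in the usage line is an assumed value for the movie example.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence):
    """Level-C interval for a population mean when sigma is known.

    Returns (low, high, margin_of_error).
    """
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z_star * sigma / math.sqrt(n)                    # margin of error
    return x_bar - margin, x_bar + margin, margin

# Movie example with assumed sigma = 0.5: x_bar = 2.1, n = 100, C = 0.95.
low, high, margin = z_confidence_interval(2.1, 0.5, 100, 0.95)
print(round(low, 3), round(high, 3), round(margin, 3))  # prints 2.002 2.198 0.098
```

With z* = 1.96 instead of the rounded value 2, the margin of error is 0.098 rather than 0.10, so the interval is slightly narrower than [2.0, 2.2].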
We would like high confidence and a small margin of error.
A higher confidence level means a higher percentage of all samples produce an interval that contains the true value of the parameter. Therefore we want a high level of confidence.
A smaller margin of error allows us to get closer to the true value of the parameter, so we want a small margin of error.
So how do we reduce the margin of error?
▪ Lower the confidence level (by decreasing the value of z*)
▪ Lower the standard deviation
▪ Increase the sample size. To cut the margin of error in half, increase the sample size to four times the previous size.
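The effect of each choice on the margin of error can be checked numerically. This is a minimal sketch; the function name and the example values (sigma = 0.5, n = 100) are assumptions for illustration.

```python
import math

def margin_of_error(z_star, sigma, n):
    """Margin of error for a z interval: z* times sigma over root n."""
    return z_star * sigma / math.sqrt(n)

base = margin_of_error(1.960, 0.5, 100)        # 95% confidence, n = 100
lower_conf = margin_of_error(1.645, 0.5, 100)  # drop to 90% confidence
smaller_sd = margin_of_error(1.960, 0.25, 100) # halve the standard deviation
bigger_n = margin_of_error(1.960, 0.5, 400)    # quadruple the sample size

print(round(base, 4), round(lower_conf, 4), round(smaller_sd, 4), round(bigger_n, 4))
# prints 0.098 0.0823 0.049 0.049
```

Quadrupling n halves the margin of error because n enters the formula under a square root.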
**** You can have high confidence and a small margin of error if you choose the right sample size.
To determine the sample size n that will yield a confidence interval for a population mean with a specified margin of error m, set the expression for the margin of error to be less than or equal to m and solve for n.
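Solving the inequality "margin of error at most m" for n gives n at least (z* times sigma over m) squared, rounded up to a whole number. A sketch of that calculation, with a hypothetical function name and assumed example values:

```python
import math
from statistics import NormalDist

def required_sample_size(m, sigma, confidence):
    """Smallest n whose z-interval margin of error is at most m."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # z* * sigma / sqrt(n) <= m  is equivalent to  n >= (z* * sigma / m) ** 2
    return math.ceil((z_star * sigma / m) ** 2)

# Assumed example: sigma = 0.5, target margin 0.05, 95% confidence.
print(required_sample_size(0.05, 0.5, 0.95))  # prints 385
```

Always round up: rounding down would leave the margin of error slightly larger than m.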
These methods only apply to certain situations. In order to construct a level C confidence interval using the formula $\bar{x} \pm z^*\,\sigma/\sqrt{n}$:
1) the data must be an SRS
2) we must know the population standard deviation $\sigma$
3) we want to eliminate (if possible) any outliers.
The margin of error only covers random sampling error.
Problems such as undercoverage, nonresponse, and poor sampling designs can cause additional errors that the margin of error does not account for.