It is commonly
believed that anyone who tabulates numbers is a statistician.
This is like believing that anyone who owns a scalpel is a surgeon.
Inference for a Population Mean
The last chapter provided practice finding confidence intervals and carrying out tests of significance in a somewhat unrealistic setting. We needed the population standard deviation σ, which we rarely had, and were forced to use the sample standard deviation s as our estimator. In reality μ and σ are rarely known. So…let’s get more specific.
The conditions for inference about a mean are as before: SRS and a normal distribution.
For these smaller (n < 30) samples, ALMOST normal is actually good enough as long as the data are mostly symmetric, without multiple peaks or outliers. Smaller samples are better handled with a t distribution since normality cannot be justified using the Central Limit Theorem. (Later, rules of thumb for different small sample sizes.)
WE THEN ESTIMATE THE STANDARD DEVIATION OF XBAR BY s/√n as before, but now we name this value the “standard error”.
When the standard deviation of a statistic is estimated from the data, the result is called the standard error of the statistic with code name SE. SO……
SE = s/√n
Substituting this value for the standard deviation σ/√n does not actually yield a normal distribution but rather a “t distribution”. The graph of a t distribution is similar to a normal density curve: symmetric about 0, single-peaked, and bell-shaped. The spread is a bit wider, giving more probability in the tails and less in the center. (More variation is present when using s in place of σ. As the df* increase, the density curve gets closer and closer to N(0,1).)
*A new term must be considered when working with t distributions, and that is “degrees of freedom” (df). Simply put, DEGREES OF FREEDOM = n − 1. (NOTE: there is a different t value for each sample size since small samples have high variability.)
We will have to adjust to these changes by switching from Table A to Table C (t distributions), on the inside back cover of the textbook, to determine critical values t*. Looking at the table, we get a clue as to the need for df in determining critical values.
Example 1: What is the critical value t* for a t distribution with 18 df and probability .90 to the left of t*? Examining the table headings, we find the values given are for the probability to the right of t*, so we must adjust for our question. From the table we find df = 18 in the column .10 instead of .90, so t* = 1.330.
Example 2: Construct a 95% confidence interval for the mean of a population based on an SRS of size n = 12. What critical value t* should be used? Consult Table C for df = 11 and the 95% confidence level at the bottom of the table. So t* = 2.201. Looking more closely at the table, we can see the corresponding critical z value of 1.960 as before.
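Both table lookups can be checked numerically. Here is a minimal sketch in Python, assuming the scipy library is available; note that scipy's `t.ppf` takes the probability to the *left* of t*, so Example 1's question can be asked directly, while Example 2's 95% confidence level becomes .975 to the left:

```python
from scipy.stats import t

# Example 1: probability .90 to the left of t*, with df = 18
tstar1 = t.ppf(0.90, df=18)
print(round(tstar1, 3))  # matches the table value 1.330

# Example 2: 95% confidence, df = 11 -> probability .975 to the left of t*
tstar2 = t.ppf(0.975, df=11)
print(round(tstar2, 3))  # matches the table value 2.201
```

This is handy for df values that fall between the rows of Table C.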
Calculating a one-sample t statistic is similar to finding a one-sample z statistic, using this formula:
t = (xbar − μ)/(s/√n)
NOTE: Sometimes data are summarized by giving xbar and its standard error rather than xbar and s. The standard error of the mean xbar is abbreviated SEM.
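As a quick sketch of the formula in code, suppose we borrow the summary statistics from Example 3 below (xbar = 114.9, s = 9.3, n = 27) and test them against a hypothesized mean; the value μ0 = 120 here is purely illustrative, not part of the study:

```python
import math

# Summary statistics from Example 3; mu0 = 120 is a hypothetical null value
xbar, s, n = 114.9, 9.3, 27
mu0 = 120

se = s / math.sqrt(n)        # standard error of xbar: s/sqrt(n)
t_stat = (xbar - mu0) / se   # one-sample t statistic
print(round(t_stat, 2))
```

The resulting t would then be compared against Table C with df = n − 1 = 26.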
Example 3: A medical study finds that xbar = 114.9 and s = 9.3 for the seated systolic blood pressure of 27 members of one treatment group. What is the standard error of the mean?
SE = s/√n = 9.3/√27 ≈ 1.790
Example 4: Biologists studying the levels of several compounds in shrimp embryos reported their results in a table, with the note, “Values are means ± SEM for 3 independent samples.” The table entry for the compound ATP was .84 ± .01. The researchers made 3 measurements of ATP having xbar of .84. What was the sample standard deviation s for these measurements?
We know xbar = .84, SE = .01, and n = 3, so we substitute and solve for s:
SE = s/√n
.01 = s/√3
s = (.01)(√3)
s = (.01)(1.732) = .01732
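The same algebra, rearranged as s = SE·√n, can be sketched in a couple of lines:

```python
import math

# Example 4: SE = .01 and n = 3; solve SE = s/sqrt(n) for s
se, n = 0.01, 3
s = se * math.sqrt(n)
print(round(s, 5))  # 0.01732
```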
Finding a one-sample t confidence interval or performing a one sample t test is analogous to the one sample z confidence interval and test.
The confidence interval for a t statistic is
xbar ± t*·s/√n
where t* is the upper critical value with df = n − 1.
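Putting the pieces together, here is a sketch of the full interval using the summary statistics from Example 3 (xbar = 114.9, s = 9.3, n = 27) at 95% confidence; scipy is again assumed to be available for the t* lookup:

```python
import math
from scipy.stats import t

xbar, s, n = 114.9, 9.3, 27   # summary statistics from Example 3
conf = 0.95

se = s / math.sqrt(n)                     # standard error: s/sqrt(n)
tstar = t.ppf((1 + conf) / 2, df=n - 1)   # upper critical value, df = 26
margin = tstar * se                       # t* times s/sqrt(n)

lower, upper = xbar - margin, xbar + margin
print(round(lower, 1), round(upper, 1))
```

Note that the interval is slightly wider than the corresponding z interval would be, reflecting the extra uncertainty from estimating σ with s.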
A confidence interval or significance test is called robust
if the confidence level or P-value does not change very much
when the assumptions are violated.