| Contents of this page: |
| Assumptions |
| Confidence Interval |
| Hypothesis Test |
| Sample Size |
| Paired Data |
After all the theory, here are some examples. Actually, we have discussed almost everything here already.

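Example: Consider again the data set for newborn babies and the drug status of their mothers. Previously we found the following summary information:
| Group | n | Mean length | Std. Dev. |
| Drug Free | 39 | 51.1 | 2.9 |
| First Trimester | 19 | 49.3 | 2.5 |
| Throughout | 36 | 48.0 | 3.6 |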

Find 90% confidence intervals for the mean length of the babies in the three groups.
Note 100(1-α)% = 90%, so α = 0.1
1) Drug free
n = 39, x̄ = 51.1, s = 2.9, t_{n-1, α/2} = t_{38, 0.05} = 1.686, so
x̄ ± t_{n-1, α/2} × s/√n = 51.1 ± 1.686 × 2.9/√39 = 51.1 ± 0.78, or (50.32, 51.88)
2) First Trimester
n = 19, x̄ = 49.3, s = 2.5, t_{n-1, α/2} = t_{18, 0.05} = 1.734, so
x̄ ± t_{n-1, α/2} × s/√n = 49.3 ± 1.734 × 2.5/√19 = 49.3 ± 0.99, or (48.31, 50.29)
3) Throughout
n = 36, x̄ = 48.0, s = 3.6, t_{n-1, α/2} = t_{35, 0.05} = 1.690, so
x̄ ± t_{n-1, α/2} × s/√n = 48.0 ± 1.690 × 3.6/√36 = 48.0 ± 1.01, or (46.99, 49.01)
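The three hand calculations above all apply the same formula, x̄ ± t × s/√n, so they can be checked with a few lines of Python. This is only a sketch: the function name `t_confidence_interval` is made up for illustration, and the critical values are typed in from a t table (t_{38, 0.05} = 1.686, t_{18, 0.05} = 1.734, t_{35, 0.05} = 1.690) rather than computed.

```python
import math

def t_confidence_interval(mean, s, n, t_crit):
    """Return (lower, upper, margin) for mean ± t_crit * s / sqrt(n).

    t_crit is assumed to be t_{n-1, alpha/2}, looked up in a t table.
    """
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin, margin

# (group, n, sample mean, sample std. dev., t_{n-1, 0.05} from a table)
groups = [
    ("Drug free", 39, 51.1, 2.9, 1.686),
    ("First trimester", 19, 49.3, 2.5, 1.734),
    ("Throughout", 36, 48.0, 3.6, 1.690),
]

for label, n, mean, s, t_crit in groups:
    lower, upper, margin = t_confidence_interval(mean, s, n, t_crit)
    print(f"{label}: {mean} ± {margin:.2f} -> ({lower:.2f}, {upper:.2f})")
```

Because the summary statistics are rounded, the last digit can differ slightly from what MINITAB reports from the raw data.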
MINITAB uses the command Stat > Basic Statistics > 1-Sample t to do all the work, find confidence intervals and much more:
Example: Find a 90% confidence interval for the mean length of babies of mothers who never took cocaine.
Solution.
First we need to check the assumptions:
Graph > Boxplot, Simple, Drug Free and Graph > Probability Plot, Single, Drug Free
both show that the data is reasonably normal
Stat > Basic Statistics > 1-Sample t, Samples in column=Drug free, Options > Confidence level=90.0
A 90% CI for the length of babies of Drug Free mothers is (50.318, 51.882)
The details of the hypothesis test for a population mean are as follows:
Null Hypothesis: H0: μ = μ0
Note: μ0 is not "μ0" but a specific number which you need to get from the problem.
Alternative Hypothesis: Choose one of the following, depending on the problem:
a) Ha: μ > μ0
b) Ha: μ < μ0
c) Ha: μ ≠ μ0
Test Statistic: T = √n (x̄ - μ0)/s
Rejection Region:
If your alternative is a) Ha: μ > μ0, then reject H0 if T > t_{n-1, α}
If your alternative is b) Ha: μ < μ0, then reject H0 if T < -t_{n-1, α}
If your alternative is c) Ha: μ ≠ μ0, then reject H0 if |T| > t_{n-1, α/2}
Example: Test at the 10% level whether the mean length of "Drug Free" babies is more than 50cm.
Solution by hand:
1) Parameter: mean μ
2) Method: 1-sample t
3) Assumptions: boxplots and normal plots show no problem with the normal assumption
4) α = 0.1
5) H0: μ = 50.0 (mean length is 50cm)
6) Ha: μ > 50.0 (mean length is more than 50cm)
7) T = √n (x̄ - μ0)/s = √39 (51.1 - 50)/2.9 = 2.37
8) We reject H0 if T > t_{n-1, α} = t_{38, 0.1} = 1.304.
T = 2.37 > 1.304, so we do reject the null hypothesis
9) The mean length is statistically significantly higher than 50cm
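Steps 7 and 8 of the hand solution can be sketched in Python as follows. The function name `one_sample_t` and its `alternative` labels are made up for illustration, and the critical value is again assumed to come from a t table: t_{n-1, α} for a one-sided alternative, t_{n-1, α/2} for a two-sided one.

```python
import math

def one_sample_t(xbar, mu0, s, n, t_crit, alternative):
    """Return (T, reject) for the one-sample t test from summary statistics.

    alternative is "greater" (Ha: mu > mu0), "less" (Ha: mu < mu0)
    or "two-sided" (Ha: mu != mu0). t_crit must match the alternative.
    """
    T = math.sqrt(n) * (xbar - mu0) / s
    if alternative == "greater":
        reject = T > t_crit
    elif alternative == "less":
        reject = T < -t_crit
    elif alternative == "two-sided":
        reject = abs(T) > t_crit
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return T, reject

# Drug Free babies: H0: mu = 50 vs Ha: mu > 50 at the 10% level
T, reject = one_sample_t(xbar=51.1, mu0=50.0, s=2.9, n=39,
                         t_crit=1.304, alternative="greater")
print(f"T = {T:.2f}, reject H0: {reject}")
```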
Solution by computer:
1) Parameter: mean μ
2) Method: 1-sample t
3) Assumptions: boxplots and normal plots show no problem with the normal assumption
4) α = 0.1
5) H0: μ = 50.0 (mean length is 50cm)
6) Ha: μ > 50.0 (mean length is more than 50cm)
7) p-value = 0.011 (Stat > Basic Statistics > 1-sample t > Samples in column: Drug Free, test mean: 50.0, Options > Alternative: greater than)
8) p-value = 0.011 < α = 0.1, so we do reject the null hypothesis
9) The mean length is statistically significantly higher than 50cm
Notice the advantage of the p-value approach: p = 0.011 shows how much the decision depends on the chosen level. We reject H0 at α = 0.1 (and would also reject at α = 0.05), but if we had chosen α = 0.01 we would have failed to reject H0.
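To see where a p-value like this comes from, here is a rough sketch that integrates the density of the t distribution numerically, using only the Python standard library. It is not how MINITAB computes the p-value; the helper names `t_pdf` and `upper_tail_p` and the use of Simpson's rule are choices made for this illustration, and the input T = 2.37 is the rounded value from the hand solution.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail_p(t_stat, df, steps=2000):
    """Approximate P(T > t_stat) with Simpson's rule (steps must be even)."""
    if t_stat < 0:
        # By symmetry of the t distribution
        return 1 - upper_tail_p(-t_stat, df, steps)
    h = t_stat / steps
    total = t_pdf(0, df) + t_pdf(t_stat, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Area from 0 to t_stat, subtracted from the upper half of the distribution
    return 0.5 - total * h / 3

# Ha: mu > 50, so the p-value is the upper tail area beyond T with n-1 = 38 df
p = upper_tail_p(2.37, 38)
print(f"p-value is approximately {p:.3f}")
```

For a two-sided alternative the p-value would be twice the tail area beyond |T|.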
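In a confidence interval of the form x̄ ± E, the term E = t_{n-1, α/2} × s/√n is called the error.
Once we have some idea what σ is we can replace the s in our formula with this σ (and the t critical value with z_{α/2}, since n is not yet known) and then solve for n: n = (z_{α/2} × σ/E)²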
Example: We found that a 90% confidence interval for the mean length of babies of "Drug Free" mothers is (50.32, 51.88), or 51.1 ± 0.78, so the error on this estimate is 0.78. What sample size would be needed to find a 90% confidence interval with an error of 0.5?
We can use the sample standard deviation s = 2.9 as a guess for the population standard deviation, so we have σ = 2.9. We want a 90% confidence interval, so we have
100(1-α)% = 90%, α = 0.1, z_{α/2} = z_{0.05} = 1.645. Therefore
n = (z_{α/2} × σ/E)² = (1.645 × 2.9/0.5)² = 91.03, which we round up to n = 92
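The same sample size calculation can be sketched in Python. The function name `sample_size_for_mean` is hypothetical; the normal critical value is taken from `statistics.NormalDist` in the standard library (Python 3.8 or later) instead of a table, and the result is always rounded up.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, error, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= error.

    sigma is a guess for the population standard deviation.
    """
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.645 for 90%
    return math.ceil((z_crit * sigma / error) ** 2)

n = sample_size_for_mean(sigma=2.9, error=0.5, confidence=0.90)
print(n)  # 92
```

Rounding up rather than to the nearest integer makes sure the error is no larger than the one asked for.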
| Subject | Weight Before | Weight After |
| Paul | 189 | 175 |
| July | 135 | 129 |
| Linda | 156 | 163 |
| Carlos | 213 | 192 |
| ... | ... | ... |
| Jose | 191 | 196 |
The analysis of paired data is actually quite simple: compute the difference for each pair, and then treat it as a one-sample problem for the population mean.
So for the data above compute the differences
| Subject | Weight Difference |
| Paul | 14 |
| July | 6 |
| Linda | -7 |
| Carlos | 21 |
| ... | ... |
| Jose | -5 |
Warning: don't forget the minus signs!
Some comments:
1) the most natural null hypothesis for a paired data problem is H0: μ_d = 0. Here μ_d is the population mean of the differences
2) the sample size n is the number of pairs (or differences), not the number of observations in the original sample
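As a sketch of the "compute the differences, then treat it as a one-sample problem" recipe, the following Python uses only the five subjects visible in the table above; the rows elided with "..." are not reproduced, so the numbers illustrate the mechanics and are not results for the full data set. The variable names are made up for this example.

```python
import math
import statistics

# Only the five rows shown in the table; the elided rows are left out
before = {"Paul": 189, "July": 135, "Linda": 156, "Carlos": 213, "Jose": 191}
after = {"Paul": 175, "July": 129, "Linda": 163, "Carlos": 192, "Jose": 196}

# Difference = before - after for each subject (keep the minus signs!)
diffs = [before[name] - after[name] for name in before]

n = len(diffs)                   # number of pairs, not number of measurements
d_bar = statistics.mean(diffs)   # sample mean of the differences
s_d = statistics.stdev(diffs)    # sample standard deviation (n-1 in the denominator)

# One-sample t statistic for H0: mu_d = 0
T = math.sqrt(n) * (d_bar - 0) / s_d

print(diffs)
print(f"n = {n}, mean = {d_bar:.1f}, s = {s_d:.2f}, T = {T:.2f}")
```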
Example: In order to see whether there is a difference in the prices of foods at two local supermarkets we bought a basket of the same items at each of the two stores. Test at the 5% level whether the mean prices are the same. The data is in foods.
Solution by hand: First compute the mean and the standard deviation of the differences:
Make column diff with Calc > Calculator, Store in diff, Expression: 'Market 1' - 'Market 2'
Stat > Basic Statistics > Display Descriptive Statistics, diff
1) Parameter: mean of differences μ_d
2) Method: 1-sample t
3) Assumptions: boxplots and normal plots show no problem with the normal assumption
4) α = 0.05
5) H0: μ_d = 0 (mean prices are the same)
6) Ha: μ_d ≠ 0 (mean prices are different)
7) n = 15, x̄ = -0.0367, s = 0.157
T = √n (x̄ - μ_d)/s = √15 (-0.0367 - 0)/0.157 = -0.9
8) We reject H0 if |T| > t_{n-1, α/2} = t_{14, 0.025} = 2.14. |T| = 0.9 < 2.14, so we fail to reject the null hypothesis
9) There is no statistically significant difference between the mean prices in the two stores
Solution by computer:
1) Parameter: mean of differences μ_d
2) Method: 1-sample t
3) Assumptions: boxplots and normal plots show no problem with the normal assumption
4) α = 0.05
5) H0: μ_d = 0 (mean prices are the same)
6) Ha: μ_d ≠ 0 (mean prices are different)
7) p = 0.381 (Stat > Basic Statistics > Paired t, First Sample: Market 1, Second Sample: Market 2)
8) p = 0.381 > 0.05 = α, so we fail to reject the null hypothesis
9) There is no statistically significant difference between the mean prices in the two stores
We could have also run a 1 sample t test on "diff", with exactly the same result.
Example: Kelby Childers asked subjects to perform several tasks before and after 24 hours of sleep deprivation. One task involved the subjects lifting weights until muscle failure. The data in sleep is a count of the number of bench presses done before ("Pre") and after ("Post") the 24 hours. Find a 95% confidence interval for the mean difference in the number of bench presses.
Solution by computer: Make column diff with Calc > Calculator, Store in diff, Expression: 'Pre' - 'Post'
Draw a boxplot and a normal plot of diff to check normality; both look fine.
Stat > Basic Statistics > Paired t, First Sample: Pre, Second Sample: Post
95% CI for mean difference: (1.46524, 3.40976)
Solution by hand: A boxplot and the normal plot show the differences in presses to be reasonably normal.
Stat > Basic Statistics > Display Descriptive Statistics, diff
n = 16, x̄ = 2.438, s = 1.825
100(1-α)% = 95%, α = 0.05, t_{15, 0.025} = 2.131
x̄ ± t_{n-1, α/2} × s/√n = 2.438 ± 2.131 × 1.825/√16 = 2.438 ± 0.972
So a 95% confidence interval for the mean difference in bench presses is (1.466, 3.410)
Note that the confidence interval indicates that the difference is positive, that is the subjects were able to bench press more before the sleep deprivation than after it, just what we would expect if there is an effect at all.
For more on inference for a single mean see page 333, page 341 and page 412 of the textbook.