Exercise Problems - Statistical Inference

For all the problems in this exercise we will use the data in exercise4, which contains data on the students in a class.
Do all the problems both by hand and using the computer where possible.

Problem 1 Find a 90% confidence interval for the percentage of seniors in this class.


Problem 2 Test at the 5% level whether the GPA of the seniors is higher than the GPA of the others. Use the classical approach. Is it possible that you committed the type I or the type II error? Write down in your own words and for this problem what that error is.


Problem 3 Find a 90% confidence interval for the difference of the mean scores in the first and second exam. Does this confidence interval suggest that the scores in the two exams were about the same? If not, which one was better?


Problem 4 Find a 99% confidence interval for the mean score in the Final exam. If a class typically has 30 students and there are 2 sections per semester, how long will we have to wait until we can give this confidence interval with an error of 2.5?


Problem 5 Find a 95% confidence interval for the percentage of women in this course. What sample size would be needed to give this confidence interval with an error of 5%?


Solutions

Problem 1 Parameter: percentage p, so this is a problem of inference for proportions
24 of the 30 students are seniors, so p̂ = 24/30 = 0.8.
Assumptions: np̂ = 30·0.8 = 24 ≥ 5 and n(1-p̂) = 30·(1-0.8) = 6 ≥ 5
100(1-α)% = 90%, so α = 0.1, so α/2 = 0.05 and z_{0.05} = 1.645. Therefore
p̂ ± z_{α/2}√(p̂(1-p̂)/n) =
0.8 ± 1.645√(0.8×0.2/30) =
0.8 ± 0.12
and we find a 90% confidence interval for the percentage of seniors to be (68%, 92%)
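
For the computer check, the same calculation can be sketched in Python. This is only an illustration, not the course's MINITAB procedure: the function name proportion_ci is made up here, and the z value comes from the standard library's NormalDist rather than from the table.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """Large-sample (normal approximation) confidence interval for a proportion."""
    p_hat = successes / n
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.645 for 90%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 24 seniors among 30 students, 90% confidence
lower, upper = proportion_ci(24, 30, 0.90)
print(f"90% CI for the proportion of seniors: ({lower:.2f}, {upper:.2f})")
```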


Problem 2

By hand:
1) Parameters: two means
2) Method: 2-sample t-test
3) Assumptions: boxplot of GPA by Seniors looks good, equal variance looks fine
4) α = 0.05
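5) H0: μ1 = μ2 (the mean GPAs are the same)
6) Ha: μ1 < μ2 (Seniors have a higher mean GPA than others)
7) The summary statistics are as follows:
Non Seniors    Seniors
n1 = 6         n2 = 24
x̄1 = 2.93      x̄2 = 3.00
s1 = 0.50      s2 = 0.57

Now the pooled standard deviation is
sp = √(((n1-1)s1² + (n2-1)s2²)/(n1+n2-2)) = √((5×0.50² + 23×0.57²)/28) = 0.558
and the test statistic is
T = (x̄1 - x̄2)/(sp√(1/n1 + 1/n2)) = (2.93 - 3.00)/(0.558×√(1/6 + 1/24)) = -0.27

8) We reject H0 if T < -t_{n1+n2-2, α} = -t_{28, 0.05} = -1.7011. Here T = -0.27 ≥ -1.7011, so we fail to reject H0
9) The GPAs of the seniors do not seem to be higher than those of the other students.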
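
The by-hand steps can be checked with a short Python sketch that works from the summary statistics only. It assumes the pooled (equal variance) form of the 2-sample t-test, as step 3 suggests, and it takes the critical value 1.7011 from the solution rather than computing it.

```python
import math

# Summary statistics from step 7 (group 1 = non-seniors, group 2 = seniors)
n1, mean1, s1 = 6, 2.93, 0.50
n2, mean2, s2 = 24, 3.00, 0.57

# Pooled standard deviation, assuming equal variances in both groups
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Test statistic for H0: mu1 = mu2 against Ha: mu1 < mu2
t_stat = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))

t_crit = 1.7011  # t_{28, 0.05}, copied from the solution, not computed here
reject_h0 = t_stat < -t_crit
print(f"T = {t_stat:.2f}, reject H0: {reject_h0}")
```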

Using MINITAB:
1) Parameters: two means
2) Method: 2-sample t-test
3) Assumptions: boxplot of GPA by Seniors looks good, equal variance looks fine
4) α = 0.05
5) H0: μ1 = μ2 (the mean GPAs are the same)
6) Ha: μ1 < μ2 (Seniors have a higher mean GPA than others)
7) p = 0.3874 (Stat > Basic Statistics > 2-Sample t)
8) We reject H0 if p < α. Here p = 0.3874 ≥ 0.05, so we fail to reject H0. This agrees with the classical approach: T = -0.27 ≥ -t_{28, 0.05} = -1.7011
9) The GPAs of the seniors do not seem to be higher than those of the other students.

We failed to reject the null hypothesis, so we cannot have committed the type I error (rejecting a true H0). We might have committed the type II error, that is we might have failed to conclude that the seniors have a higher mean GPA although they really do.


Problem 3 Parameters: two means, so this is a problem of inference for two means, paired data

• by hand:
First we find the differences (d = Exam1 - Exam2) and then we find their sample mean (d̄ = -4.13) and standard deviation (s_d = 4.64). With this we find
(1-α)100% = 90%, so α = 0.1, t_{n-1, α/2} = t_{29, 0.05} = 1.6991
d̄ ± t_{n-1, α/2}·s_d/√n = -4.13 ± 1.6991×4.64/√30 = -4.13 ± 1.44
and so a 90% confidence interval for the difference of the mean scores in the two exams is (-5.57, -2.69)
• MINITAB Stat > Basic Statistics > Paired Data

The confidence interval contains only negative values and does not include 0, so the scores in the two exams were not about the same: it appears that the scores on the first exam were lower than those on the second exam.
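
The paired interval can be recomputed from the summary statistics with a few lines of Python. This is a sketch under the assumption that only the mean and standard deviation of the differences are available; the critical value 1.6991 is taken from the solution rather than computed.

```python
import math

n = 30           # number of students (pairs of exam scores)
d_bar = -4.13    # sample mean of the differences d = Exam1 - Exam2
s_d = 4.64       # sample standard deviation of the differences
t_crit = 1.6991  # t_{29, 0.05}, copied from the solution

# Margin of error and interval: d_bar +/- t * s_d / sqrt(n)
margin = t_crit * s_d / math.sqrt(n)
lower, upper = d_bar - margin, d_bar + margin
print(f"90% CI for the mean difference: ({lower:.2f}, {upper:.2f})")
```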


Problem 4 Parameter: one mean, so this is a problem of inference for the mean

• by hand:
We have n = 30, x̄ = 61.23 and s = 23.45. Also (1-α)100% = 99%, so α = 0.01, t_{n-1, α/2} = t_{29, 0.005} = 2.7564, and therefore
x̄ ± t_{n-1, α/2}·s/√n = 61.23 ± 2.7564×23.45/√30 = 61.23 ± 11.80
and so a 99% confidence interval for the mean score on the final exam is (49.43, 73.03).
• MINITAB Stat > Basic Statistics > 1-Sample t

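We want E = 2.5, so with z_{α/2} = z_{0.005} = 2.58 we need a sample size of
n = (z_{α/2}·s/E)² = (2.58×23.45/2.5)² = 585.7, which we round up to 586,
so we need 586/30 = 19.5, that is 20, sections. At 2 sections per semester and 2 semesters per year it will take about 5 years.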
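
The sample size and waiting time can be checked with the Python sketch below. The helper required_n is a made-up name. The value z = 2.58 is an assumption: it is the rounded table value that reproduces the 586 quoted above, while the unrounded z from NormalDist gives a slightly smaller n.

```python
import math
from statistics import NormalDist

s = 23.45                  # sample standard deviation of the final exam scores
E = 2.5                    # desired margin of error
class_size = 30            # students per section
sections_per_year = 2 * 2  # 2 sections per semester, 2 semesters per year

def required_n(z, s, E):
    """Smallest n with z * s / sqrt(n) <= E."""
    return math.ceil((z * s / E) ** 2)

n_table_z = required_n(2.58, s, E)                           # rounded table value (assumed)
n_exact_z = required_n(NormalDist().inv_cdf(0.995), s, E)    # unrounded z_{0.005}

sections = math.ceil(n_table_z / class_size)
years = sections / sections_per_year
print(n_table_z, n_exact_z, sections, years)
```

Either z value leads to 20 sections, so the answer of about 5 years does not depend on the rounding.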


Problem 5 Parameter: percentage, so this is a problem of inference for the proportions

We have n = 30, X = 16 (women), so p̂ = 16/30 = 0.53. (1-α)100% = 95%, so α = 0.05, so z_{α/2} = z_{0.025} = 1.96, so
p̂ ± z_{α/2}√(p̂(1-p̂)/n) = 0.53 ± 1.96√(0.53×0.47/30) = 0.53 ± 0.18
and a 95% confidence interval for the percentage of women is (35%, 71%)
For an error of 5% we have E=0.05, and so