Final 221220 Statmeth Solutions
20 December 2022
1.
a) Because of the Central Limit Theorem, $\hat{P}_{1600}$ has approximately a normal distribution with mean $p$,
which is the true population proportion of Dutch citizens who feel that Sinterklaas is an important
part of Dutch tradition, and variance $p(1-p)/1600$.
2.
a) The mean weight $\bar{X}_3$ of three independent, randomly sampled patients is a random variable with a
normal distribution with mean $\mu = 112$ and variance $\sigma^2 = 10^2/3 = 100/3 \approx 33.3$.
Thus,
\[ P(\bar{X}_3 > 130) = P\left(Z > \frac{130 - 112}{10/\sqrt{3}}\right) = 1 - P(Z \le 3.12) = 1 - 0.9991 = 0.0009, \]
where $Z \sim N(0, 1)$.
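The tail probability above can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative, and the values 112, 10, 3 and 130 are taken from the solution text.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 112, 10, 3        # population mean, population sd, sample size
threshold = 130                  # mean weight we ask about

se = sigma / sqrt(n)             # standard deviation of the sample mean
z = (threshold - mu) / se        # standardised threshold
tail = 1 - NormalDist().cdf(z)   # P(Z > z) for Z ~ N(0, 1)

print(round(z, 2), round(tail, 4))
```

The result may differ from the table-based value in the last digit, because the table lookup rounds z to two decimals first.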
The test requires either sufficiently large samples (both ≥ 30) or normally distributed individual
measurements.
Either condition is satisfied here.
c) Parameter of interest: µ1 − µ2 ,
Hypotheses: H0 : µ1 − µ2 = 0 vs. Ha : µ1 − µ2 < 0.
The choice of test statistic has been made and motivated in b), so here we only give the formula
and state its distribution under H0 :
\[ T_2^{\mathrm{eq}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_p^2/35 + S_p^2/35}} \]
3.
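a) We use the formula
\[ \left[ \hat{P}_1 - \hat{P}_2 - z_{\alpha/2} \sqrt{\frac{\hat{P}_1(1-\hat{P}_1)}{n_1} + \frac{\hat{P}_2(1-\hat{P}_2)}{n_2}},\ \hat{P}_1 - \hat{P}_2 + z_{\alpha/2} \sqrt{\frac{\hat{P}_1(1-\hat{P}_1)}{n_1} + \frac{\hat{P}_2(1-\hat{P}_2)}{n_2}} \right] \]
for α = 10%, which yields
\[ \left[ \frac{30}{40} - \frac{24}{36} - 1.645 \sqrt{\frac{30/40 \cdot (1 - 30/40)}{40} + \frac{24/36 \cdot (1 - 24/36)}{36}},\ \frac{30}{40} - \frac{24}{36} + 1.645 \sqrt{\frac{30/40 \cdot (1 - 30/40)}{40} + \frac{24/36 \cdot (1 - 24/36)}{36}} \right] \]
\[ = \left[ \frac{3}{4} - \frac{2}{3} - 1.645 \sqrt{\frac{3}{4} \cdot \frac{1}{4} / 40 + \frac{2}{3} \cdot \frac{1}{3} / 36},\ \frac{3}{4} - \frac{2}{3} + 1.645 \sqrt{\frac{3}{4} \cdot \frac{1}{4} / 40 + \frac{2}{3} \cdot \frac{1}{3} / 36} \right] \]
\[ \approx \left[ \frac{1}{12} - 1.645 \cdot 0.104,\ \frac{1}{12} + 1.645 \cdot 0.104 \right] \]
\[ \approx [-0.088,\ 0.255]. \]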
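The interval can be reproduced with a few lines of Python. This is a sketch under the assumption that the counts are 30 out of 40 and 24 out of 36, as in the computation above; the function name `diff_prop_ci` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def diff_prop_ci(x1, n1, x2, n2, alpha):
    """Two-sided (1 - alpha) confidence interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - alpha / 2)              # z_{alpha/2}, about 1.645 for alpha = 0.10
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # estimated sd of P1-hat minus P2-hat
    diff = p1 - p2
    return diff - z * se, diff + z * se

lower, upper = diff_prop_ci(30, 40, 24, 36, alpha=0.10)
print(round(lower, 3), round(upper, 3))
```

Since the interval contains 0, the data do not rule out equal proportions at this confidence level.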
b) Parameter of interest: p1 − p2 ,
significance level: α = 5%,
H0 : p1 = p2 vs. Ha : p1 > p2 .
Test statistic:
\[ Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\frac{\bar{P}(1-\bar{P})}{n_1} + \frac{\bar{P}(1-\bar{P})}{n_2}}}, \]
where $\bar{P} = \frac{X_1 + X_2}{n_1 + n_2}$;
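As an illustration of the pooled test statistic, the sketch below evaluates it in Python. It assumes that the counts from part a) (30 of 40 and 24 of 36) are also the data for this test; that reuse is an assumption, and `pooled_z` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def pooled_z(x1, n1, x2, n2):
    """Two-proportion z statistic using the pooled proportion under H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)                                    # pooled estimate
    se = sqrt(p_bar * (1 - p_bar) / n1 + p_bar * (1 - p_bar) / n2)   # sd under H0
    return (p1 - p2) / se

z = pooled_z(30, 40, 24, 36)                 # assumed counts, reused from part a)
p_value = 1 - NormalDist().cdf(z)            # right-tailed, matching Ha: p1 > p2
critical_value = NormalDist().inv_cdf(0.95)  # 95%-quantile for alpha = 5%

print(round(z, 3), round(p_value, 3), z > critical_value)
```

Under these assumed counts the score stays below the 95%-quantile, so H0 would not be rejected.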
c) Example justification:
These two approaches in general yield the same conclusion:
The critical value method gives us a significant test result if the test score exceeds the 95%-quantile
of N (0, 1).
The p-value method gives us a significant outcome if the area to the right of the test score z under
the N(0, 1) density is smaller than 5%. This is the case exactly when the score exceeds the 95%-quantile.
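The equivalence of the two decision rules can also be checked numerically. The sketch below compares them on a hypothetical grid of test scores; the function names and the grid are illustrative choices, not part of the exam.

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05
critical_value = std_normal.inv_cdf(1 - alpha)   # 95%-quantile of N(0, 1)

def reject_by_critical_value(z):
    """Right-tailed test: reject when the score exceeds the 95%-quantile."""
    return z > critical_value

def reject_by_p_value(z):
    """Right-tailed test: reject when the area to the right of z is below alpha."""
    return 1 - std_normal.cdf(z) < alpha

scores = [i / 100 for i in range(-300, 301)]     # hypothetical scores from -3.00 to 3.00
agree = all(reject_by_critical_value(z) == reject_by_p_value(z) for z in scores)
print(agree)
```

Both rules agree because the standard normal distribution function is strictly increasing, so comparing scores and comparing tail areas order the outcomes in the same way.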
4.
a) This concerns a test of homogeneity because the data were sampled at two different times, and also
because we are primarily interested in the equality/difference of proportions of active users.
Hypotheses:
H0 : p11 = p21 and p12 = p22 vs. Ha : p11 ≠ p21 or p12 ≠ p22 .
Alternatively, phrased in words:
H0 : The “before take-over” population and the “after take-over” population are homogeneous with
respect to their proportion of active/inactive users.
vs. Ha : The “before take-over” population and the “after take-over” population are heterogeneous
with respect to their proportion of active/inactive users.
b) Under the null hypothesis, each expected frequency is the product of the corresponding row and
column totals divided by the grand total. Thus,
These expected numbers are all bigger than 5, so the requirements of the test are met.
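The rule for expected frequencies can be sketched in Python as follows. The observed table is not reproduced in this text, so the counts below are purely hypothetical placeholders, and `expected_counts` is an illustrative helper name.

```python
def expected_counts(observed):
    """Expected frequencies under homogeneity: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical counts, NOT the exam data.
# Rows: before / after take-over; columns: active / inactive users.
observed = [[60, 40],
            [45, 55]]

expected = expected_counts(observed)
all_above_5 = all(e > 5 for row in expected for e in row)   # rule-of-thumb requirement
print(expected, all_above_5)
```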
5.
b) Coefficient estimates:
\[ b_1 = r \cdot \frac{s_y}{s_x} = 0.393 \cdot \frac{1.011}{196.544} \approx 0.002 \]
\[ b_0 = \bar{y} - b_1 \cdot \bar{x} = 7.673 - 3991.751 \cdot 0.002 \approx -0.311 \]
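The arithmetic for the two estimates can be retraced in Python. This is a sketch using the summary statistics quoted above; the variable names are illustrative.

```python
r = 0.393                         # sample correlation
s_x, s_y = 196.544, 1.011         # sample standard deviations of x and y
x_bar, y_bar = 3991.751, 7.673    # sample means of x and y

b1 = r * s_y / s_x                            # slope estimate
b0_rounded = y_bar - round(b1, 3) * x_bar     # intercept using the slope rounded to 0.002
b0_unrounded = y_bar - b1 * x_bar             # intercept using the unrounded slope

print(round(b1, 5), round(b0_rounded, 3), round(b0_unrounded, 3))
```

Because the mean of x is large, the intercept is sensitive to how the slope is rounded: plugging in the rounded slope 0.002 matches the value given above, while carrying the unrounded slope gives a noticeably different intercept.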
c) In general:
Residual plots can be used to check the fixed error variance assumption, i.e. whether or not the
spread of the residuals is about constant for all values of x or whether there is some trend or some
other strange pattern.
QQ-plots based on the residuals can be used to check the normality assumption for the error terms.
d) This subquestion is granted full points for everybody because sb1 was missing on the
exam. The value was announced during the exam, but some students had already left by that time.
Parameter of interest: β1 , the slope parameter in the linear regression model.
H0 : β1 = 0 vs. Ha : β1 ≠ 0.
Test statistic: $T = B_1 / S_{B_1}$,
which under H0 (requirements are met, see c)) has approximately a t-distribution with n − 2 =
38 − 2 = 36 degrees of freedom.
t-score: $t = 0.002 / 0.00079 \approx 2.532$.
We want to compare this score to the critical values $\pm t_{36,\,0.025} = \pm 2.028$.
The test score lies in the right-hand critical region, so we can reject H0 .
Consequently, we conclude that the Quiz scores do have an influence on the exam grades.
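The decision in d) can be retraced with the short sketch below. It takes the rounded slope 0.002, the standard error 0.00079 and the table value 2.028 directly from the solution; the critical value is hard-coded rather than computed, since Python's standard library has no t-distribution.

```python
b1 = 0.002          # rounded slope estimate from part b)
s_b1 = 0.00079      # standard error of the slope, as used in the solution
n = 38              # number of observations
df = n - 2          # degrees of freedom for the slope test

t_crit = 2.028      # t_{36, 0.025}, table value quoted in the solution
t_score = b1 / s_b1

reject_h0 = abs(t_score) > t_crit   # two-sided test at the 5% level
print(df, round(t_score, 3), reject_h0)
```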