2019JUNMS


Institute of Actuaries of India

Subject CS1-Paper A – Actuarial Statistics

June 2019 Examination

INDICATIVE SOLUTION

Introduction

The indicative solution has been written by the Examiners with the aim of helping candidates. The
solutions given are only indicative. It is realized that there could be other valid points and answers, and
the examiners have given credit for any alternative approach or interpretation which they consider to be
reasonable.
IAI CS1A-0619

Solution 1:
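i) Probability distribution of points (P) scored is:
\[ P(P = 0) = \frac{1}{\binom{52}{3}} \binom{26}{0}\binom{26}{3} \]
\[ P(P = 1) = \frac{1}{\binom{52}{3}} \binom{26}{1}\binom{26}{2} \]
\[ P(P = 2) = \frac{1}{\binom{52}{3}} \binom{26}{2}\binom{26}{1} \]
\[ P(P = 3) = \frac{1}{\binom{52}{3}} \binom{26}{3}\binom{26}{0} \]
[1.5 Marks]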
The MGF of this distribution is
\[ E(e^{tP}) = \frac{1}{\binom{52}{3}} \left[ \binom{26}{0}\binom{26}{3} e^{0} + \binom{26}{1}\binom{26}{2} e^{t} + \binom{26}{2}\binom{26}{1} e^{2t} + \binom{26}{3}\binom{26}{0} e^{3t} \right] \]
[1]
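Hence the MGF of X, where X = P1 + P2 (subscripts 1 and 2 refer to player 1 and player 2
respectively), can be stated as:

\[ E(e^{tX}) = E\left(e^{t \sum_i P_i}\right) = \prod_{i=1}^{2} E(e^{tP_i}) \]
[1]
As the draws are independent,
[0.5]
\[ \prod_{i=1}^{2} E(e^{tP_i}) = \left(E(e^{tP})\right)^2 = \left( \frac{1}{\binom{52}{3}} \left[ \binom{26}{3} + \binom{26}{1}\binom{26}{2} e^{t} + \binom{26}{2}\binom{26}{1} e^{2t} + \binom{26}{3} e^{3t} \right] \right)^2 \]
[1]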
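[5]
ii) Using the MGF, \( E(X) = \frac{d}{dt} E(e^{tX}) \) evaluated at t = 0 [1]
\[ E(X) = \frac{2}{\binom{52}{3}^2} \left[ \binom{26}{3} + \binom{26}{1}\binom{26}{2} e^{t} + \binom{26}{2}\binom{26}{1} e^{2t} + \binom{26}{3} e^{3t} \right] \times \left[ \binom{26}{1}\binom{26}{2} e^{t} + 2\binom{26}{2}\binom{26}{1} e^{2t} + 3\binom{26}{3} e^{3t} \right] \quad \text{at } t = 0 \]
[1]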


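Hence
\[ E(X) = \frac{2}{\binom{52}{3}^2} \left( 2\binom{26}{3} + 2\binom{26}{1}\binom{26}{2} \right) \times \left( 3\binom{26}{2}\binom{26}{1} + 3\binom{26}{3} \right) = \frac{12}{\binom{52}{3}^2} \left( \binom{26}{3} + \binom{26}{1}\binom{26}{2} \right)^2 \]
[1]
[3]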
[8 Marks]
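
The distribution and the MGF result can be checked numerically. The sketch below assumes, as the formulas above suggest, that P counts how many of the 3 drawn cards fall in one group of 26 out of the 52; the function and variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

def points_pmf(k: int) -> Fraction:
    """P(P = k): k of the 3 drawn cards come from one group of 26, the rest from the other 26."""
    return Fraction(comb(26, k) * comb(26, 3 - k), comb(52, 3))

pmf = {k: points_pmf(k) for k in range(4)}

# E(P) = M_P'(0) = sum over k of k * P(P = k)
mean_p = sum(k * p for k, p in pmf.items())

# M_X(t) = M_P(t)^2, so M_X'(0) = 2 * M_P(0) * M_P'(0), where M_P(0) = 1
expected_x = 2 * sum(pmf.values()) * mean_p

print(pmf)
print(expected_x)
```

Exact fractions are used so that the check does not depend on floating-point rounding; under these assumptions the final expression for E(X) evaluates to 3.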

Part (i) was not answered correctly by the majority of the candidates and hence they could not
attempt Part (ii). Students who could identify the probability distribution of the event scored
highly on the question.

Solution 2:

i)
\( \sum X_{Public} = 270 \) [0.5]
\( \bar{X}_{Public} = 30 \) [0.5]
\( \sum X_{Private} = 315.5 \) [0.5]
\( \bar{X}_{Private} = 35.06 \) [0.5]
\( \sum X_{Public}^2 = 8614.5 \) [1]
(subscript 1 refers to public hospitals, subscript 2 to private hospitals)
\( S_1^2 = (8614.5 - 9 \times 30^2)/8 = 64.31 \) [1]
\( \sum X_{Private}^2 = 11599.25 \) [1]
\( S_2^2 = (11599.25 - 9 \times 35.06^2)/8 = 67.42 \) [1]
[6]
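
The summary statistics above can be reproduced with a short sketch. It uses the unrounded sample means, so the private-hospital variance comes out marginally different from the quoted 67.42; the function name is illustrative.

```python
def sample_variance(sum_x: float, sum_x2: float, n: int) -> float:
    """Unbiased sample variance from the sum and sum of squares of n observations."""
    mean = sum_x / n
    return (sum_x2 - n * mean ** 2) / (n - 1)

s1_sq = sample_variance(270.0, 8614.5, 9)      # public hospitals
s2_sq = sample_variance(315.5, 11599.25, 9)    # private hospitals

print(round(s1_sq, 2), round(s2_sq, 2))
```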

ii) A two-sample t-test can be applied if the samples come from populations with
equal variances.
[1]
We are testing \( H_0: \sigma_1^2 = \sigma_2^2 \) vs \( H_1: \sigma_1^2 \neq \sigma_2^2 \)
[1]
Test statistic is \( \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F_{n_1-1,\,n_2-1} \)
[0.5]
Value of the statistic is \( \frac{64.31}{67.42} = 0.9542 \)
[0.5]
The lower and upper 2.5% points of \( F_{8,8} \) (for a two-sided test at the 5% level) are 0.2256 and 4.433. Since
the value of the test statistic lies between these values, we have insufficient evidence to reject the
null hypothesis, and it is reasonable to treat the population variances as equal.
[1]
[4]

iii) We are testing \( H_0: \mu_1 = \mu_2 \) vs \( H_1: \mu_1 < \mu_2 \)
[0.5]
Test statistic is
\[ \frac{(\bar{X}_2 - \bar{X}_1) - (\mu_2 - \mu_1)}{\sqrt{S_P^2 (1/n_1 + 1/n_2)}} \sim t_{n_1+n_2-2} \]
[1]
where \( S_P^2 = \frac{S_1^2 (n_1 - 1) + S_2^2 (n_2 - 1)}{n_1 + n_2 - 2} \)
[1]
Using the values in part (i) above,
\[ S_P^2 = \frac{64.31 \times 8 + 67.42 \times 8}{16} = 65.86 \]
[0.5]
The value of the test statistic is
\[ \frac{(35.06 - 30) - 0}{\sqrt{65.86 (1/9 + 1/9)}} = 1.32 \]
[0.5]
\( P(t_{16} > 1.32) \approx 10.3\% \) (the corresponding two-sided probability \( P(|t_{16}| > 1.32) \) is 20.5%)
[0.5]

This is higher than 5%, hence we do not have sufficient evidence to reject the null
hypothesis, and we conclude that the cost of claims in private hospitals is similar to
that in public hospitals.
[1]
[5]
[15 Marks]
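
As a rough check of part (iii), the sketch below recomputes the pooled variance and the test statistic from the rounded figures quoted above, and approximates the upper-tail t probability by Simpson's rule on the Student t density. The helper `t_upper_tail` is a hypothetical name introduced here, not a library function.

```python
import math

def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T > t) for t >= 0, via Simpson's rule on the Student t density over [0, t]."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3

n1 = n2 = 9
s1_sq, s2_sq = 64.31, 67.42
mean_public, mean_private = 30.0, 35.06

pooled = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
t_stat = (mean_private - mean_public) / math.sqrt(pooled * (1 / n1 + 1 / n2))
p_one_sided = t_upper_tail(t_stat, n1 + n2 - 2)

print(round(pooled, 2), round(t_stat, 2), round(p_one_sided, 3))
```

The one-sided probability is the relevant one for the alternative \( \mu_1 < \mu_2 \); doubling it gives the two-sided figure.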

Parts (i) and (iii) of this question were generally well answered by most of the candidates. Many
candidates could not attempt part (ii) correctly.

Solution 3:
i)

[0.5 Marks for each step]

[0.5 Marks for each step]


[3]


ii)

[2]

[2]
[4]
iii)

[2]
[9 Marks]
This question was well answered by most of the candidates. Some candidates lost marks due
to incorrect computation or evaluation of the integrals, but the concepts were correctly
applied by the majority of the candidates.


Solution 4:
i) The slope and intercept parameters can be derived as the estimated values of β and α in the
following equation

\( y = \alpha + \beta x + \epsilon \) [0.5]
where the \( \epsilon \) are the error terms, which are assumed to be independent and identically
distributed normal random variables.
Under linear regression, α and β can be found by minimizing the sum of squared errors, i.e. the
squared distances between the observed and predicted values of y.

The mathematical expressions for the estimates of α and β are

\[ \hat{\beta} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \]
[2]

\( \hat{\alpha} = \bar{y} - \hat{\beta} \, \bar{x} \) [0.5]
Using the data given,
Slope = 6.825, intercept = 8.84 [3]
Alternative answer : intercept: 7.65, slope 6.78
*********************************************************************

Splitting the total sum of squares into regression and residual sum of squares:
SS Total = SS Regression + SS Residual
Using standard notation: SS Total = \( S_{yy} \); SS Regression = \( S_{xy}^2 / S_{xx} \)
[6]
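
The least squares formulas and the sum-of-squares split can be sketched as below. The question's data are not reproduced in this solution, so the `xs` and `ys` values are purely hypothetical and only illustrate the calculation.

```python
def least_squares_fit(xs, ys):
    """Simple linear regression: estimates plus the sum-of-squares decomposition."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar
    ss_regression = s_xy ** 2 / s_xx
    return {
        "alpha": alpha,
        "beta": beta,
        "ss_total": s_yy,
        "ss_regression": ss_regression,
        "ss_residual": s_yy - ss_regression,
    }

# Hypothetical data, not the data from the exam question
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]
fit = least_squares_fit(xs, ys)
print(fit)
```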
ii) R-squared = SS Regression / SS Total = 130425.8 / 149597.4 = 0.8718
alternate answer: 86.86% [2]

iii) Standard error of \( \hat{\beta} = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} = \sqrt{\frac{3834.34}{2800}} = 1.170216 \) [1.5]

The two-sided 95% confidence interval for β is \( \hat{\beta} \pm t_{0.025,5} \times se(\hat{\beta}) \)

i.e. 6.825 ± 2.571 × 1.170216 = (3.8164, 9.8336) [1.5]
alternate answer: 6.78 ± 2.571 × 1.1782
[3]
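
Parts (ii) and (iii) can be reproduced from the quoted sums of squares. The sketch below assumes n = 7 observations, which is implied by the 5 degrees of freedom used for the t value; variable names are illustrative.

```python
import math

ss_total = 149597.4
ss_regression = 130425.8
n = 7             # assumed: n - 2 = 5 residual degrees of freedom
s_xx = 2800.0
beta_hat = 6.825
t_crit = 2.571    # t_{0.025, 5} from tables

r_squared = ss_regression / ss_total
sigma2_hat = (ss_total - ss_regression) / (n - 2)
se_beta = math.sqrt(sigma2_hat / s_xx)
ci = (beta_hat - t_crit * se_beta, beta_hat + t_crit * se_beta)

print(round(r_squared, 4), round(se_beta, 4), tuple(round(v, 4) for v in ci))
```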

iv) ANOVA Table

Source of variation   Degrees of freedom   SS          MSS
Regression            1                    130425.8    130425.8
Residual              5                    19171.68    3834.336
Total                 6                    149597.4


F-test:
H0: β = 0
F-statistic = 130425.8/ 3834.336 = 34.01521 on (1,5) degrees of freedom [1]
Critical value of F(1,5) = 10.01 [0.5]
Since the F-statistic is greater than the critical value,
H0 is rejected at the 2.5% level. [0.5]

Hence, the slope parameter is statistically significantly different from zero. [1]

alternate answer:

             df   SS         MS         F          Significance F
Regression   1    129951.4   129951.4   33.07313   0.00223
Residual     5    19646.07   3929.213
Total        6    149597.4
Note for markers: At all levels of significance (1%, 2.5%, 5%), critical values for the F
distribution will be lower than the test statistic value. Marks should be provided for
any level of significance used by the student. [3]
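
A quick check of the main-answer F-test, using the figures quoted in the ANOVA table. It also illustrates the standard fact that, in simple linear regression, the F-statistic equals the square of the t-statistic for the slope; names are illustrative.

```python
ms_regression = 130425.8
ms_residual = 3834.336
f_critical = 10.01        # upper 2.5% point of F(1, 5) from tables

f_stat = ms_regression / ms_residual
reject_h0 = f_stat > f_critical

# Equivalent t-statistic for the slope: beta_hat / se(beta_hat)
t_stat = 6.825 / 1.170216

print(round(f_stat, 4), reject_h0, round(t_stat ** 2, 4))
```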

v) The residual plot is U-shaped, so the residuals are observed to follow a pattern [1]
The non-random pattern in the residuals indicates that the deterministic portion of
the regression model is not capturing some explanatory information. The
possibilities could include:
1. A missing variable
2. A missing higher-order term of a variable in the model, needed to explain the
polynomial trend in the residuals [2]
From the above graph, it looks likely that including a higher-order term of the
independent variable would resolve this problem. [3]
[17 Marks]
The question (except part (v)) was largely well answered. Computational errors were made by
a significant number of candidates. In part (v) most of the candidates identified that the
distribution of the residuals does not seem to be normal but could not provide any additional
comments beyond that.

Solution 5:

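(i) \( \hat{\theta} \) is said to be unbiased when \( E(\hat{\theta}) = \theta \) [1]

(ii) The measure of the 'bias' is given by \( E(\hat{\theta}) - \theta \) [1]
(iii) Mean Square Error (MSE) of the estimator \( \hat{\theta} \) is \( E[(\hat{\theta} - \theta)^2] \), which equals \( Var(\hat{\theta}) + (E(\hat{\theta}) - \theta)^2 \) [1]
(iv) \( \tilde{\theta} \) is more efficient, as an estimator with lower MSE is said to be more efficient than one with
higher MSE. [1]
(v) An estimator is termed consistent if its MSE converges to 0 as the sample size tends to ∞
[1]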

(vi) 𝜃 can be estimated using: [mention any 2 methods, 1 mark each]



a. Method of moments: the population moments are equated to the sample moments to
estimate the parameters.

b. Maximum likelihood method: A likelihood function \( L(\theta) = \prod_{i=1}^{n} f(x_i; \theta) \) is
formed. The maximum likelihood estimate of the parameter is given by the solution to \( \frac{dL(\theta)}{d\theta} = 0 \)

c. Bootstrap method: This is a computer-intensive method that allows us to avoid making
assumptions about the sampling distribution by forming an empirical sampling distribution,
which is obtained by re-sampling from the available sample.
This was a bookwork question and was well answered. In part (vi) some students provided
only the names of the methods without the accompanying narration.
[7 Marks]
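
To illustrate method (c), here is a minimal non-parametric bootstrap sketch for the sampling distribution of a sample mean. The data, the number of resamples, and the seed are all hypothetical choices made for the example.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Resample with replacement and record the mean of each resample."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)]

sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8]   # hypothetical observations
means = bootstrap_means(sample)

# The spread of the resampled means estimates the standard error of the sample mean
std_error = statistics.stdev(means)
print(round(statistics.fmean(means), 3), round(std_error, 3))
```

No distributional assumption is made about the population; the empirical distribution of the observed sample stands in for it.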

Solution 6:
The residuals are based on differences between the observed responses, y, and the
fitted responses, \( \hat{\mu} \).
Pearson residuals
\[ \text{Pearson residual} = \frac{y - \hat{\mu}}{\sqrt{Var(\hat{\mu})}} \] [2]
Deviance residuals

\[ \text{Deviance residual} = sign(y - \hat{\mu}) \times d_i \]

where \( d_i^2 \) is the contribution of the i-th observation to the scaled deviance \( \sum d_i^2 \) [2]
[4 Marks]
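
As an illustration, the sketch below evaluates both residuals for a single observation under an assumed Poisson model, where the variance function is \( \hat{\mu} \) and the deviance contribution is \( 2[y \ln(y/\hat{\mu}) - (y - \hat{\mu})] \). The choice of distribution and the numbers are assumptions for the example only.

```python
import math

def pearson_residual(y: float, mu: float) -> float:
    """Pearson residual under a Poisson model, where Var = mu."""
    return (y - mu) / math.sqrt(mu)

def deviance_residual(y: float, mu: float) -> float:
    """Signed square root of the Poisson deviance contribution of one observation."""
    log_term = y * math.log(y / mu) if y > 0 else 0.0
    d_squared = max(2.0 * (log_term - (y - mu)), 0.0)
    return math.copysign(math.sqrt(d_squared), y - mu)

print(round(pearson_residual(3, 2), 4), round(deviance_residual(3, 2), 4))
print(round(deviance_residual(0, 2), 4))
```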
This was a bookwork question and was generally well answered. Some students did not
explain the meaning of the terms used.

Solution 7:
If B1, B2, B3, …, Bk constitute a partition of a sample space S and P(Bi) ≠ 0 for i = 1, 2, 3, …, k,
then for any event A in S such that P(A) ≠ 0:
\[ P(B_r|A) = \frac{P(A|B_r) P(B_r)}{P(A)}, \quad \text{where } P(A) = \sum_{i=1}^{k} P(A|B_i) P(B_i), \quad \text{for } r = 1, 2, 3, \ldots, k \]
[1.5]
Derivation:
\[ P(A \cap B) = P(A) P(B|A) \]
On rearranging:
\[ P(B|A) = \frac{P(A \cap B)}{P(A)} \]
However,
\[ P(A \cap B) = P(B \cap A) = P(B) P(A|B) \]
Now, replacing B by \( B_r \), we have:
\[ P(B_r|A) = \frac{P(B_r \cap A)}{P(A)} = \frac{P(B_r) P(A|B_r)}{P(A)} \]

And from the law of total probability:

\[ P(A) = \sum_{i} P(A|B_i) P(B_i) \] [2.5]
[4 Marks]
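
A small numerical illustration of the theorem, using a hypothetical three-event partition. The prior and conditional probabilities are made up for the example.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods):
    """Return P(B_r | A) for each r, given P(B_r) and P(A | B_r)."""
    # Law of total probability: P(A) = sum of P(A | B_i) * P(B_i)
    p_a = sum(l * p for l, p in zip(likelihoods, priors))
    return [l * p / p_a for l, p in zip(likelihoods, priors)]

priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]        # P(B_1), P(B_2), P(B_3)
likelihoods = [Fraction(1, 10), Fraction(1, 5), Fraction(2, 5)]   # P(A | B_r)
posterior = bayes_posterior(priors, likelihoods)
print(posterior)
```

Because the B's form a partition, the posterior probabilities sum to 1.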
Many students failed to provide the proof. The majority of the attempts were limited to a statement
of Bayes' Theorem.

Solution 8:
i) Wickets taken per 500 balls follow a Poi(5) distribution. As the number of trials (balls) is very
high and the Poisson parameter is >= 5, we can use the normal approximation to the Poisson distribution.
[0.5]
Thus the wickets taken approximately follow N(5,5). [0.5]
Hence we need x such that:
\[ P\left(Z < \frac{x - 5}{\sqrt{5}}\right) = 0.95 \] [0.5]

Critical value at 95% confidence is 1.65 [0.5]

Thus
\[ \frac{x - 5}{\sqrt{5}} = 1.65 \]

Hence \( X < 8.68 \) with 95% probability [0.5]

As the number of wickets can only take whole values, we need to truncate the number to the
lower whole number. Hence the team takes up to 8 wickets at the 95% confidence level.
[0.5]
[3]
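
The bound in part (i) can be reproduced with the standard library's normal distribution. This sketch uses the exact 95th percentile of the standard normal rather than the rounded table value.

```python
import math
from statistics import NormalDist

# Normal approximation to Poi(5): mean 5, variance 5
wickets = NormalDist(mu=5.0, sigma=math.sqrt(5.0))

upper = wickets.inv_cdf(0.95)       # 95th percentile of the approximating normal
max_wickets = math.floor(upper)     # wickets are whole numbers, so round down

print(round(upper, 2), max_wickets)
```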
ii) For team B, the runs in 50 balls will follow Bin(50, 0.4) [0.5]
The mean and variance of this binomial distribution are 50 × 0.4 = 20 and 50 × 0.4 ×
(1 − 0.4) = 12 respectively [1]
For a large number of trials with the probability of success close to 0.5 (or np > 10), the normal
approximation can be applied, and thus the runs per 50 balls approximately follow N(20, 12) [1]
Probability of team B scoring 26 or more runs in 50 balls is thus \( P\left(Z > \frac{26 - 20}{\sqrt{12}}\right) = 4.16\% \) [1]

The Poisson rate of taking wickets (by team A) is 1 per 100 balls, i.e. 0.01 per ball. Hence the
number of wickets taken by team A in 50 balls has rate 0.01 × 50, i.e. it follows a Poi(0.5) distribution. [1]
Probability of not taking any wicket is \( \frac{0.5^0}{0!} e^{-0.5} = 60.65\% \) [1]

Hence the probability of team A winning is 1 − 0.6065 = 39.35% [1]

Thus Team A has the higher probability of winning. [0.5]
[7]
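
The two probabilities in part (ii) can be checked as follows. The sketch follows the solution's approach (normal approximation without a continuity correction); variable names are illustrative.

```python
import math
from statistics import NormalDist

n_balls, p_run = 50, 0.4
mean_runs = n_balls * p_run                   # 20
var_runs = n_balls * p_run * (1 - p_run)      # 12

# Team B: P(runs >= 26) under the normal approximation N(20, 12)
p_team_b = 1 - NormalDist(mean_runs, math.sqrt(var_runs)).cdf(26)

# Team A: wickets in 50 balls ~ Poi(0.01 * 50)
rate = 0.01 * n_balls
p_no_wicket = math.exp(-rate)
p_team_a = 1 - p_no_wicket

print(round(p_team_b, 4), round(p_no_wicket, 4), round(p_team_a, 4))
```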


iii) Probability that team B bats for 30 balls = (Probability of waiting time (in terms of number
of balls) > 30) × (Probability of B not scoring 26 runs in 30 balls) [0.5]
The waiting time has an Exp(0.01) distribution, hence P(T > 30) = exp(−0.01 × 30) = 74.08% [1]
Probability of B not scoring 26 runs = \( P\left(Z < \frac{26 - 30 \times 0.4}{\sqrt{30 \times 0.4 \times 0.6}}\right) = P(Z < 5.22) \approx 100\% \) [1]

Hence there is approximately a 74.08% × 100% = 74.08% chance that team B will bat for 30 balls. [0.5]
[3]
[13 Marks]
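
The sketch below evaluates the part (iii) expression exactly as written, with the standard deviation \( \sqrt{30 \times 0.4 \times 0.6} \) in the denominator. With that denominator the z-value is a little above 5, so the normal probability is essentially 1 and the answer is driven almost entirely by the waiting-time term.

```python
import math
from statistics import NormalDist

balls = 30

# No wicket in the first 30 balls: waiting time T ~ Exp(0.01)
p_wait = math.exp(-0.01 * balls)

# Runs in 30 balls ~ Bin(30, 0.4), approximated by a normal distribution
mean_runs = balls * 0.4
sd_runs = math.sqrt(balls * 0.4 * 0.6)
z = (26 - mean_runs) / sd_runs
p_below_target = NormalDist().cdf(z)

p_bats_30 = p_wait * p_below_target
print(round(p_wait, 4), round(z, 2), round(p_bats_30, 4))
```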

Most of the students struggled with this question. Not applying the CLT and not rounding off the
number of wickets were the common mistakes in part (i). In part (ii) most of the students
computed the probability of scoring 'exactly' 26 runs and used that in the answer. Only a handful
of candidates attempted part (iii).

Solution 9:
i) Prior distribution of μ is Gamma (4,7)
\[ f_{prior}(\mu) = \frac{7^4}{\Gamma(4)} \mu^{3} e^{-7\mu} \]

Thus \( f_{prior}(\mu) \) is proportional to \( \mu^{3} e^{-7\mu} \) [1]

The likelihood is the product of the Poisson probabilities:

\[ L(\mu) = \frac{\mu^{x_1}}{x_1!} e^{-\mu} \times \frac{\mu^{x_2}}{x_2!} e^{-\mu} \times \ldots \times \frac{\mu^{x_n}}{x_n!} e^{-\mu} \]
The likelihood function \( L(\mu) \) is proportional to \( \mu^{\sum x_i} e^{-n\mu} \) [1]

So, \( f_{posterior}(\mu) \) is proportional to \( \mu^{3 + \sum x_i} e^{-7\mu - n\mu} \)

The posterior distribution of μ thus takes the form of a Gamma (4 + \( \sum x_i \), 7 + n). [1]
[3]
ii) (a) Squared error loss
When n=10 and ∑ 𝑥𝑖 = 15, the posterior distribution of μ is Gamma (19, 17).
The Bayesian estimate of μ under squared error loss is the mean of the posterior
distribution. [1]
Bayesian estimate = mean of posterior distribution = mean of Gamma (19, 17) =
19/17 = 1.1176 [1]
(b) All-or-nothing loss
The Bayesian estimate under all-or-nothing loss is the mode of the posterior
distribution. [1]
To find the mode we need to differentiate the PDF and equate it to zero.

\( f_{posterior}(\mu) = k \, \mu^{18} e^{-17\mu} \), where k is a constant [1]

Taking logs and differentiating:

\[ \frac{d}{d\mu} \ln f_{posterior}(\mu) = \frac{18}{\mu} - 17 \]
Equating the derivative to zero will give us the value of μ which maximizes the PDF
and thus will give us the mode of the distribution.
μ = 18/17 [0.5]
Differentiating again gives us \( -18/\mu^2 \), which is less than zero. This confirms that
the previous step gives us a maximum. [0.5]
So, the Bayesian estimate under all-or-nothing loss is 18/17
(c) Absolute error loss

The Bayesian estimate under absolute error loss is the median of the posterior
distribution. [1]

The posterior distribution is Gamma (19, 17). Let X denote a random variable with the posterior
distribution, so X ~ Gamma (19, 17). Then \( 2 \times 17 \times X \sim \chi^2_{38} \). [1]

The median of the posterior distribution is the value of M such that P (X < M) = 0.5
Equivalently, \( P(\chi^2_{38} < 34M) = 0.5 \)
From the tables we can see that the 50th percentile of \( \chi^2_{38} \) is 37.34.
Hence, M = 37.34/34 = 1.098
So, the Bayesian estimate under absolute error loss is 1.098 [1]
[11 Marks]
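
The three Bayesian estimates can be reproduced without tables. The sketch below relies on the fact that a Gamma distribution with integer shape has a closed-form CDF (the Erlang form), and finds the median by bisection; the helper names are illustrative.

```python
import math

def gamma_poisson_posterior(alpha, beta, data_sum, n):
    """Gamma(alpha, beta) prior with a Poisson likelihood gives Gamma(alpha + sum x, beta + n)."""
    return alpha + data_sum, beta + n

def erlang_cdf(x: float, shape: int, rate: float) -> float:
    """CDF of Gamma(shape, rate) for integer shape."""
    lam = rate * x
    return 1.0 - sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(shape))

def erlang_median(shape: int, rate: float, lo=0.0, hi=10.0, tol=1e-10) -> float:
    """Bisection search for the point where the CDF equals 0.5."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if erlang_cdf(mid, shape, rate) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

shape, rate = gamma_poisson_posterior(4, 7, data_sum=15, n=10)
posterior_mean = shape / rate            # squared error loss
posterior_mode = (shape - 1) / rate      # all-or-nothing loss
posterior_median = erlang_median(shape, rate)   # absolute error loss

print(round(posterior_mean, 4), round(posterior_mode, 4), round(posterior_median, 3))
```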

Part (i) was answered well by the well-prepared candidates. Students made
mistakes in Part (ii) by matching the mean / median / mode to loss measures other than those
given in the solution.

Solution 10:
Let \( Y_{ij} \) and \( P_{ij} \) be the claim amounts and number of employees covered for company i and
year j respectively.
Denote \( X_{ij} = Y_{ij} / P_{ij} \); N = 2, n = 4

\[ \bar{P}_A = \sum_j P_{Aj} = 121 + 119 + 120 + 110 = 470 \]

\[ \bar{P}_B = \sum_j P_{Bj} = 150 + 135 + 122 + 145 = 552 \]

\[ \bar{P} = \bar{P}_A + \bar{P}_B = 1022 \]

\[ P^* = \frac{1}{7} \left[ 470 \left(1 - \frac{470}{1022}\right) + 552 \left(1 - \frac{552}{1022}\right) \right] = 72.53 \] [1]

Table for claims per unit employee, \( X_{ij} \):

            Year 1   Year 2   Year 3   Year 4
Company A   33.75    36.87    37.92    31.42
Company B   30.89    23.73    16.99    30.93

Using the formulae from the tables,

\( \bar{X}_A = 35.0574 \)
\( \bar{X}_B = 26.0779 \)
\( \bar{X} = 30.2074 \)

\( E[m(\theta)] = 30.2074 \) [2]

\[ E[s^2(\theta)] = \frac{1}{N} \sum_{i=1}^{2} \frac{1}{n-1} \sum_{j=1}^{4} P_{ij} (X_{ij} - \bar{X}_i)^2 = \frac{1}{2} (1011.036 + 5904.065) = 3457.551 \]
[2]

\[ Var[m(\theta)] = \frac{1}{72.53} \left( \frac{1}{7} \times 41214.23 - 3457.551 \right) = 33.5061 \] [2]

Putting the above derived values in the formulae:

\[ Z_A = \frac{470}{470 + \frac{3457.551}{33.5061}} = 0.81997 \] [1.5]

\[ Z_B = \frac{552}{552 + \frac{3457.551}{33.5061}} = 0.842501 \] [1.5]
Using credibility theory, the credibility premium per unit of risk volume is given by:

Company A: \( Z_A \bar{X}_A + (1 - Z_A) E[m(\theta)] = 34.1843 \)

Company B: \( Z_B \bar{X}_B + (1 - Z_B) E[m(\theta)] = 26.72829 \)

The EBCT claim amounts for the coming year for the two companies are:
Company A: 4614.88; Company B: 4142.886 [2]
[12 Marks]
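
The credibility calculation can be retraced from the quoted intermediate values. In this sketch the group mean for Company A is taken as 35.0574, which is the value that reproduces both the quoted overall mean of 30.2074 and the quoted premium of 34.1843; treat that figure, and the variable names, as assumptions of the example.

```python
volumes = {"A": 470.0, "B": 552.0}            # total employees over the 4 years
group_means = {"A": 35.0574, "B": 26.0779}    # A's value is an assumption, see above
overall_mean = 30.2074                        # E[m(theta)]
e_s2 = 3457.551                               # E[s^2(theta)]
var_m = 33.5061                               # Var[m(theta)]
n_cells = 8                                   # N * n = 2 companies * 4 years

total_volume = sum(volumes.values())
p_star = sum(p * (1 - p / total_volume) for p in volumes.values()) / (n_cells - 1)

# Credibility factor Z_i = P_i / (P_i + E[s^2] / Var[m])
z = {k: p / (p + e_s2 / var_m) for k, p in volumes.items()}

# Credibility premium per unit of risk volume
premium = {k: z[k] * group_means[k] + (1 - z[k]) * overall_mean for k in volumes}

print(round(p_star, 2), {k: round(v, 5) for k, v in z.items()})
print({k: round(v, 4) for k, v in premium.items()})
```

Multiplying each premium by the coming year's number of employees (given in the question, not reproduced here) yields the total EBCT claim amounts.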

This question was attempted by the majority of the students. Computational errors were made by
some, while the rest scored highly in this question.

***************
