
Assignments 2017 PDF

This document provides instructions for a two-part statistics assignment. Part I consists of 6 questions worth a total of 30 marks. It is due on May 16th and is worth 25% of the total grade. Part II consists of 3 questions focused on using R for big data analysis and risk analysis. It is due on June 6th and also worth 25% of the total grade. Students are expected to submit their code and plots for both assignments.

Assignment Part I

Statistics and Financial Econometrics


Autumn 2017

• This Assignment Part I is individual, containing six questions.

• Three freely chosen questions are compulsory, the remaining are supplementary.

• Each compulsory question yields 10 marks.

• The solution of the supplementary questions gives additional marks up to a sum of 30 marks.

• The submission date is Tuesday 16/05/2017 before the lecture at 6:00pm (on paper, hand-written is OK).

• The weight of the Assignment Part I is 25%.

QUESTION 1 (10 marks)


Consider the following coin tossing experiment. At the beginning one of two coins is selected
randomly. Then this coin is tossed ten times. The first coin is fair, showing head and tail with
probabilities $\frac{1}{2}$ each. The second coin shows head with probability $\frac{3}{4}$ and tail with $\frac{1}{4}$. Both coins
are selected with the same probability of $\frac{1}{2}$. Consider the variables $(Y_t)_{t=1}^{10}$ taking values in $\{0, 1\}$
which indicate the observations such that $\{Y_t = 1\}$ and $\{Y_t = 0\}$ are the events of observing
"Head" and "Tail" in the experiment $t = 1, \dots, 10$ respectively. Introduce the variable $X$ taking
values in $\{1, 2\}$ which indicates which coin has been selected. Consider the events

\[ A = \{Y_1 = 1, Y_2 = 0, Y_3 = 1, Y_4 = 1\}, \qquad B = \Big\{ \sum_{i=1}^{10} Y_i = 7 \Big\}. \]

a) Are the random variables $(Y_t)_{t=1}^{10}$ independent?

b) Determine the joint distribution of Y = (Y6 , Y7 )

c) Calculate the distribution of X conditioned on A.

d) Calculate three distributions of Y6 conditioned on A, on B , and on A ∩ B .

QUESTION 2 (10 marks)


Suppose that one of five servers i = 1, . . . , 5 can be used to process a certain request. Thereby, each
server i = 1, . . . , 5 rejects the request with probability of i/5 and gives a successful response with
the probability of 1 − i/5. Suppose that the user does not know about the server characteristics
and sends request to a randomly chosen server.

a) Suppose that a request was sent to a server and was rejected. Then it was sent again to
the same server and was rejected again. If the request will now be sent to another server
(randomly chosen from the remaining four), what is the probability of a successful response?

b) Suppose that a request was sent to a server and was rejected. Then it was sent to another
server (randomly chosen from the remaining four) and was also rejected. Then the request is
sent to another server (randomly chosen from the remaining three). Calculate the probability
of a successful response.

QUESTION 3 (10 marks)


Consider a card deck consisting of 52 cards with 26 black and 26 red cards. Suppose that the cards
are taken from the deck one by one and are uncovered.

a) Calculate the probability that the third red card occurs at the trial i = 3, 4 . . . , under the
assumptions that each card is returned to a random location (put the card into the deck and
shuffle).

b) Calculate the probability that the third red card occurs at the trial i = 3, 4 . . . under the
assumptions that each card is not returned to the deck but is put aside.

QUESTION 4 (10 marks)


Suppose that a company uses three network printers $i \in \{1, 2, 3\}$. Within each minute, a number
$Y$ of jobs is received and each job is assigned to a printer, randomly chosen from $\{1, 2, 3\}$ with
equal probabilities 1/3, to be stored in that printer's queue waiting for processing. Assuming that
the queues are initially empty and that $Y$ follows a Poisson distribution with parameter $\lambda = 5$,
denote by $X = (X_i)_{i=1}^{3}$ the numbers of jobs waiting in the queues of the printers $i = 1, 2, 3$ after the first
minute.

a) Determine the distribution of X conditioned on {Y = 5}.

b) Determine the distribution of X1 . (Hint: Poisson distribution, determine its parameter.)

c) Determine the probability of the event $\bigcup_{i=1}^{3} \{X_i = 0\}$ that at least one of the queues is
empty.

QUESTION 5 (10 marks)


Let A, B, and C be independent events.

a) Supposing that the event probabilities satisfy

P(A \ (B ∪ C)) = 0.1, P(B) = P(C) = 1/2,

determine the probability P(A ∩ B ∩ C).

b) Assume that $P(A \cup B) = \frac{5}{9}$. What is the maximal possible value for $P(A \cap B)$?

QUESTION 6 (10 marks)

Consider independent random variables $U$ and $V$ and assume that $U$
follows a normal distribution with mean $\mu = 0$ and variance $\sigma^2 = 1$ and $V$ follows an exponential
distribution with scale parameter $\beta = 1$. Define the variables

\[ X = U, \qquad Y = U + V. \]

a) Determine the joint density fX,Y of the random variables X and Y .

b) Calculate the marginal densities fX and fY .

Assignment Part II
Statistics and financial econometrics
Autumn 2017

• This Assignment Part II is individual, containing three questions.

• The submission date is Tuesday 06/06/2017 before the lecture at 6:00pm.

• All problems shall be addressed using R or Julia.

• The code and plots shall be submitted on paper.

• The weight of the Assignment Part II is 25%.

Problem 1 (Big Data: 15 marks) Working with big data requires specific skills and software.
The statistical language R provides diverse packages to address big data analysis. One such
package is data.table, which offers an alternative to the traditional data.frame objects and their
methods. The superior computational efficiency of data.table shall be examined in the following
exercise.

a) Study the traditional data.frame objects and methods including reading and writing data
(read the documentation ?data.frame, ?read.table, ?write.table). Install the package
data.table (by typing install.packages("data.table"), load this package by
require("data.table")) and study its documentation ( ?data.table, ?fread, ?fwrite,
run the examples example(data.table)).

b) Download from our UTSOnline site the file Poloniex.csv.zip (38,7 MB) and unpack it to
Poloniex.csv (971,9 MB). These data contain market snapshots (at regular times) of trading
digital currencies at the Poloniex exchange https://poloniex.com/exchange#btc_eth

c) Read the data using a traditional method
market <- read.csv("Poloniex.csv", skip=1, header=T) and record the time spent on
this operation (use proc.time()).

d) Compare this time to the time spent on market1 <- fread("Poloniex.csv", skip=1, header=T).
Compare also the performance of data writing operations write.csv(x=market, file="test.csv"),
fwrite(x=market1, file="test.csv") to their binary versions
saveRDS(object=market, file="test.rds") and saveRDS(object=market1, file="test.rds").

e) Determine the column names for the object market1. Create a data field prices by
extracting those columns, whose names end with the string _last. (use the command
grep(pattern="_last", ....) to extract these names). Subindex market1 on these names.
The resulting field will contain the last paid prices of digital currencies at snapshot times.

f) Extract the column representing the time of the snapshot from the data field market1 (use
sub-indexing market1[["TIME"]] or market1[,TIME]), convert these machine times to
the calendar time objects using
time_stamps <- as.POSIXct(..., origin = '1970-01-01', tz = 'GMT')
Which time range is encompassed? What is the average time interval between the snapshots?

g) Create and store (as RDS-object, use saveRDS) a data field named five_min_prices which
contains all last prices, sampled at five-minute intervals with the first column representing
the machine time of the snapshot.
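
Steps f) and g) can be tried on a toy example before touching the full file. The following is a minimal sketch in base R, assuming hypothetical machine times spaced five seconds apart; the real spacing in Poloniex.csv has to be determined from the data, and the names machine_times, avg_step and five_min_times are illustrative only.

```r
# Hypothetical machine times (seconds since 1970-01-01), 5 s apart; not taken from Poloniex.csv
machine_times <- seq(from = 1490000000, by = 5, length.out = 601)
time_stamps <- as.POSIXct(machine_times, origin = "1970-01-01", tz = "GMT")

range(time_stamps)                       # time range covered by the snapshots
avg_step <- mean(diff(machine_times))    # average spacing in seconds

# Keep every k-th snapshot so that consecutive kept rows are 300 s apart
k <- round(300 / avg_step)
index <- seq(from = 1, to = length(machine_times), by = k)
five_min_times <- machine_times[index]
```

The same index vector can then be used to subset the rows of the price table.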

Problem 2 (Risk Analysis: 15 marks) Standard work on financial data encompasses calculation
of moments, determining distributions, detecting outliers and estimating covariances. We will use
the prices of digital currencies, sampled from Poloniex exchange at five minute intervals to practice
these steps.

a) Load the data generated in g) of the Problem 1. (Alternatively, use readRDS to load the
data file five_min_prices.rds obtained from UTSonline.)

b) Convert the machine times of the snapshots to calendar times (use as.POSIXct as in f) of
the Problem 1) and plot Ethereum prices, expressed in Bitcoins (this is the column with the
name BTC_ETH_last), against calendar times.

c) Install and load the package (install.packages("e1071"); require("e1071")) containing
routines for skewness and kurtosis calculation. Calculate the price increments (use diff)
of Ethereum, expressed in Bitcoins, and store them in the object diff_price_BTC_ETH_last.
Determine the empirical mean, variance, skewness and kurtosis of diff_price_BTC_ETH_last.

d) Plot empirical quantiles of diff_price_BTC_ETH_last against standard normal quantiles.

e) Determine the 0.001 and 0.999 empirical quantiles of diff_price_BTC_ETH_last and plot
its empirical density in this range using plot(density(diff_price_BTC_ETH_last), xlim=...).
Plot the normal density, whose mean and standard deviation equal those of diff_price_BTC_ETH_last,
on the same graph.

f) Determine the average of all increments, which exceeded 0.99-quantile,
for the data diff_price_BTC_ETH_last.
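
The tail average in f) and the moment calculations in c) can be rehearsed on synthetic data. The sketch below assumes a heavy-tailed toy sample (Student t with 3 degrees of freedom) as a stand-in for the price increments; it uses base R only, and the kurtosis line is the plain moment version, which need not coincide exactly with the estimator returned by e1071.

```r
set.seed(1)                                # reproducible toy data
increments <- rt(10000, df = 3)            # heavy-tailed stand-in, not real price increments

q99 <- quantile(increments, probs = 0.99)  # empirical 0.99-quantile
tail_mean <- mean(increments[increments > q99])  # average of the exceedances

# Plain moment-based excess kurtosis, written out to show the definition
m <- mean(increments)
excess_kurtosis <- mean((increments - m)^4) / mean((increments - m)^2)^2 - 3
```

By construction the tail average lies above the quantile it is conditioned on.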

Problem 3 (Principal Component Analysis: 15 marks) Diverse problems in economics,
finance and data sciences are related to the following situation. Suppose a non-deterministic
mechanism returns random objects with values in a high-dimensional space (for instance, the yield curve
on an arbitrarily chosen date). Although the objects vary in a high-dimensional space, most
of the variation can be explained by an appropriate approximation within a certain low-dimensional
subspace. In these situations, the selection of such a subspace can be addressed by the so-called
Principal Component Analysis (PCA).

More specifically, assume that a random phenomenon produces a large number $n \in \mathbb{N}$ of
$p$-dimensional random vectors $(x^{(j)})_{j=1}^{n} \subset \mathbb{R}^p$ which are arranged as rows in the so-called data
matrix $X \in M_{n,p}$ containing $n$ rows and $p$ columns. The goal of the principal component analysis
is to extract the so-called principal axes. To this end, one determines for a given number $k \in \mathbb{N}$ the
$k$ largest eigenvalues $\lambda_1 \ge \dots \ge \lambda_k$ of $X^\top X$, and the principal axes $b_1, \dots, b_k \in \mathbb{R}^p$ are obtained as
orthonormal eigenvectors of $X^\top X$ corresponding to these eigenvalues. With this, given an element
$x \in \mathbb{R}^p$, its principal component approximation is calculated as $\hat{x} = \sum_{i=1}^{k} (x^\top b_i) b_i$.
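
The approximation formula can be illustrated on a small example. The sketch below assumes a toy data matrix whose rows lie close to a two-dimensional subspace of $\mathbb{R}^5$; the matrix, the noise level and the names B, x_hat and rel_error are illustrative choices, not part of the assignment.

```r
set.seed(1)
n <- 200; p <- 5
# Toy data: random combinations of two fixed vectors plus small noise (assumed for illustration)
basis <- rbind(c(1, 1, 1, 1, 1), c(1, 2, 3, 4, 5))
X <- matrix(rnorm(n * 2), nrow = n) %*% basis + matrix(rnorm(n * p, sd = 0.01), nrow = n)

EVD <- eigen(t(X) %*% X, symmetric = TRUE)  # eigenvalues are returned in decreasing order
k <- 2
B <- EVD$vectors[, 1:k, drop = FALSE]       # principal axes b_1, ..., b_k as columns

x <- X[1, ]
x_hat <- as.vector(B %*% (t(B) %*% x))      # sum over i of (x' b_i) b_i
rel_error <- sqrt(sum((x - x_hat)^2)) / sqrt(sum(x^2))
```

With k = 2 the relative error is small here because the toy rows were built from two basis vectors; with k = 1 it would generally be larger.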

a) Generate the data matrix X using the following code:

set.seed(10) #initialize pseudo random generator for reproducibility


n<-3000; p=100
basis_functions<-list(sin, cos, dnorm, pnorm)
arguments<-seq(from=0, to=1, length=p)
function_values<-matrix(data=0, nrow=length(basis_functions), ncol=length(arguments))
for (i in 1:length(basis_functions)) function_values[i,]<-basis_functions[[i]](arguments)
random_coefficients<-matrix(data=0, nrow=n, ncol=length(basis_functions))
random_coefficients[,]<-rnorm(nrow(random_coefficients)*ncol(random_coefficients), mean=3, sd=1)
X<-random_coefficients%*%function_values

Plot the first 20 rows on the same graph
(as lines, against arguments, using plot(x=arguments, y=..., type="l"))

b) Determine the first three principal axes $b_1, b_2, b_3$ and plot them (against arguments).

c) Determine the projections $\hat{x}^{(k)} = \sum_{i=1}^{k} (x^\top b_i) b_i$ for the vector $x = (X_{1,j})_{j=1}^{p}$ obtained as the
first row of the data matrix $X$. Plot $\hat{x}^{(k)}$ for $k = 1, 2, 3$ and $x$ on the same graph (against
arguments).

Solutions Assignment Part II
Statistics and Financial Econometrics
Autumn 2017

#################################################################################
#
# Problem 1
#
#################################################################################
#
# a)
#
#######################################
rm(list = ls())

?data.frame
?read.table
?write.table

#install.packages("data.table")
require("data.table")

?fread
?fwrite

#######################################
#
# b)
#
#######################################

working_directory<-"/home/juri/data/Assignment2-2017/"
setwd(working_directory)

#######################################
#
# c)
#
#######################################

tic<-proc.time()
market <- read.csv("Poloniex.csv", skip=1, header=T);

toc<-proc.time()
print(toc-tic)

#######################################
#
# d)
#
#######################################

tic<-proc.time()
market1 <- fread("Poloniex.csv", skip=1, header=T);
toc<-proc.time()
print(toc-tic)

tic<-proc.time()
write.csv(x=market, file="test.csv");
toc<-proc.time()
print(toc-tic)

tic<-proc.time()
fwrite(x=market1, file="test.csv");
toc<-proc.time()
print(toc-tic)

tic<-proc.time()
saveRDS(object=market1, file="test.rds");
toc<-proc.time()
print(toc-tic)

tic<-proc.time()
saveRDS(object=market, file="test.rds");
toc<-proc.time()
print(toc-tic)

#######################################
#
# e)

#
#######################################

column_names<-colnames(market1);column_names
last<-column_names[grep(pattern="_last", x=column_names)]
prices<-market1[,..last]
names(prices)

#######################################
#
# f)
#
#######################################
time_stamps <- as.POSIXct(market1[["TIME"]],origin = '1970-01-01', tz = 'GMT');
max(time_stamps); min(time_stamps)
mean(diff(time_stamps))

#######################################
#
# g)
#
#######################################

index<-seq(from=1, to=nrow(market1), by=60)


five_min_prices<-market1[index, ]
timeandlast<-c("TIME", last)
five_min_prices<-five_min_prices[ ,..timeandlast]
names(five_min_prices)
saveRDS(object=five_min_prices, file="five_min_prices.rds")

################################################################################
#
# Problem 2
#
#################################################################################
#
# a)
#
#######################################

five_min_prices<-readRDS(file="five_min_prices.rds")

#######################################
#
# b)
#
#######################################
time_stamps <- as.POSIXct(five_min_prices[["TIME"]],origin = '1970-01-01', tz = 'GMT');
plot(time_stamps , five_min_prices[,BTC_ETH_last] , type="l");

#######################################
#
# c)
#
#######################################
diff_price_BTC_ETH_last<-diff(five_min_prices[,BTC_ETH_last])
#install.packages("e1071")
require("e1071")
mean(diff_price_BTC_ETH_last)
var(diff_price_BTC_ETH_last)
sd(diff_price_BTC_ETH_last)
skewness(diff_price_BTC_ETH_last)
kurtosis(diff_price_BTC_ETH_last)
#######################################
#
# d)
#
#######################################
qqnorm(diff_price_BTC_ETH_last)
#######################################
#
# e)
#
#######################################

from_to<-quantile(diff_price_BTC_ETH_last, probs =c(0.001,0.999))


plot(density(diff_price_BTC_ETH_last), xlim=from_to)
arguments=seq(from=from_to[1], to=from_to[2], length=100)
values<-dnorm(mean=mean(diff_price_BTC_ETH_last), sd=sd(diff_price_BTC_ETH_last), arguments)
points(arguments, values, col="red", type="l")

#######################################
#
# f)
#
#######################################
mean(diff_price_BTC_ETH_last[diff_price_BTC_ETH_last>quantile(diff_price_BTC_ETH_last, probs=0.99)])

# Leftover lines: the object `history` is not defined anywhere in this script
# minute_history<-history[index, ]
# time_stamps <- as.POSIXct(minute_history[["TIME"]],origin = '1970-01-01', tz = 'GMT');

################################################################################
#
# Problem 3
#
#################################################################################
#######################################
#
# a)
#
#######################################
set.seed(10) #initialize pseudo random generator for reproducibility
n<-3000; p=100
basis_functions<-list(sin, cos, dnorm, pnorm)
arguments<-seq(from=0, to=1, length=p)
function_values<-matrix(data=0, nrow=length(basis_functions), ncol=length(arguments))
for (i in 1:length(basis_functions)) function_values[i,]<-basis_functions[[i]](arguments)
random_coefficients<-matrix(data=0, nrow=n, ncol=length(basis_functions))
random_coefficients[,]<-rnorm(nrow(random_coefficients)*ncol(random_coefficients), mean=3, sd=1)
X<-random_coefficients%*%function_values
#
plot(x=arguments, y=X[1,], type="l", ylim=c(min(X[1:20, ]), max(X[1:20, ]))) # first row sets up the axes
for (i in (2:20)) points(x=arguments, y=X[i,], type="l") # rows 2 to 20 as further lines
#######################################
#
# b)
#
#######################################
EVD<-eigen(t(X)%*%X)
plot(x=arguments,y=EVD$vectors[,1], type="l", col="black", ylim=c(min(EVD$vectors[,1:3]), max(EVD$vectors[,1:3])))
points(x=arguments,y=EVD$vectors[,2], type="l")
points(x=arguments,y=EVD$vectors[,3], type="l")

#######################################
#
# c)
#
#######################################
x<-X[1,]
b<-EVD$vectors[,1:1]
hatx<-b%*%(t(b)%*%x)
plot(x=arguments,y=x, type="l", col="black", ylim=c(min(X[1:5, ]), max(X[1:5, ])))
points(x=arguments,y=hatx, type="l", col="red")
#
b<-EVD$vectors[,1:2]
hatx<-b%*%(t(b)%*%x)
points(x=arguments,y=hatx, type="l", col="red")
#
b<-EVD$vectors[,1:3]
hatx<-b%*%(t(b)%*%x)
points(x=arguments,y=hatx, type="l", col="red")

Solutions Assignment Part I
Statistics and Financial Econometrics
Autumn 2017

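SOLUTION 1 a) Not independent, see b).

b) For $(i, j) \in \{0, 1\}^2$ it holds that

\[ P((Y_6, Y_7) = (i, j)) = P((Y_6, Y_7) = (i, j) \mid X = 1) P(X = 1) + P((Y_6, Y_7) = (i, j) \mid X = 2) P(X = 2). \]

That is, with rows and columns ordered as $1, 0$,

\[ \big(P((Y_6, Y_7) = (i, j))\big)_{i,j=1}^{0}
= \begin{pmatrix} \frac{1}{4} & \frac{1}{4} \\ \frac{1}{4} & \frac{1}{4} \end{pmatrix} \cdot \frac{1}{2}
+ \begin{pmatrix} \frac{9}{16} & \frac{3}{16} \\ \frac{3}{16} & \frac{1}{16} \end{pmatrix} \cdot \frac{1}{2}
= \begin{pmatrix} \frac{13}{32} & \frac{7}{32} \\ \frac{7}{32} & \frac{5}{32} \end{pmatrix}. \]

This distribution does not factorize into the product of its marginals: $P(Y_6 = 1) = P(Y_7 = 1) = \frac{20}{32} = \frac{5}{8}$,
but $(\frac{5}{8})^2 = \frac{25}{64} \neq \frac{13}{32}$. Hence $Y_6$ and $Y_7$ are not independent.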
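c) It holds that

\[ P(X = 1) = \frac{1}{2} = P(X = 2), \qquad P(A \mid X = 2) = \Big(\frac{3}{4}\Big)^3 \Big(\frac{1}{4}\Big), \qquad P(A \mid X = 1) = \Big(\frac{1}{2}\Big)^4. \]

Thus Bayes' rule gives

\[ P(X = 2 \mid A) = \frac{(\frac{3}{4})^3 (\frac{1}{4})}{(\frac{3}{4})^3 (\frac{1}{4}) + (\frac{1}{2})^4} = 27/43 = 0.627907, \]

\[ P(X = 1 \mid A) = \frac{(\frac{1}{2})^4}{(\frac{3}{4})^3 (\frac{1}{4}) + (\frac{1}{2})^4} = 16/43 = 0.372093. \]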
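d) By the law of total probability,

\[ P(Y_6 = 1 \mid A) = P(Y_6 = 1, X = 1 \mid A) + P(Y_6 = 1, X = 2 \mid A). \]

Now, since $Y_6$ and $A$ are independent given $X$,

\begin{align*}
P(Y_6 = 1, X = 1 \mid A) &= \frac{P(\{Y_6 = 1\} \cap \{X = 1\} \cap A)}{P(A)} \\
&= \frac{P(\{Y_6 = 1\} \cap \{X = 1\} \cap A)}{P(\{X = 1\} \cap A)} \cdot \frac{P(\{X = 1\} \cap A)}{P(A)} \\
&= \frac{P(\{Y_6 = 1\} \cap \{X = 1\})}{P(\{X = 1\})} \cdot \frac{P(\{X = 1\} \cap A)}{P(A)} \\
&= P(Y_6 = 1 \mid X = 1) P(X = 1 \mid A).
\end{align*}

Similarly, we obtain

\[ P(Y_6 = 1, X = 2 \mid A) = P(Y_6 = 1 \mid X = 2) P(X = 2 \mid A), \]

which gives the desired probability

\begin{align*}
P(Y_6 = 1 \mid A) &= P(Y_6 = 1 \mid X = 2) P(X = 2 \mid A) + P(Y_6 = 1 \mid X = 1) P(X = 1 \mid A) \\
&= \frac{3}{4} \cdot \frac{(\frac{3}{4})^3 (\frac{1}{4})}{(\frac{3}{4})^3 (\frac{1}{4}) + (\frac{1}{2})^4} + \frac{1}{2} \cdot \frac{(\frac{1}{2})^4}{(\frac{3}{4})^3 (\frac{1}{4}) + (\frac{1}{2})^4} \\
&= \frac{3}{4} \cdot \frac{27}{43} + \frac{1}{2} \cdot \frac{16}{43} = 113/172.
\end{align*}

Similarly,

\[ P(Y_6 = 1 \mid B) = 7/10, \qquad P(Y_6 = 0 \mid B) = 3/10, \]

\[ P(Y_6 = 1 \mid A \cap B) = 4/6, \qquad P(Y_6 = 0 \mid A \cap B) = 2/6. \]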
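
The numbers in b), c) and the first part of d) can be reproduced with a few lines of R. This is a sketch of the mixture and Bayes calculations only; the variable names p_head, prior, joint, post_A and p_Y6_given_A are illustrative.

```r
p_head <- c(1/2, 3/4)          # P(Y_t = 1 | X = 1), P(Y_t = 1 | X = 2)
prior  <- c(1/2, 1/2)          # P(X = 1), P(X = 2)

# Joint distribution of (Y6, Y7): mix the two product distributions over the coin
joint <- matrix(0, 2, 2, dimnames = list(Y6 = c("1", "0"), Y7 = c("1", "0")))
for (x in 1:2) {
  marg <- c(p_head[x], 1 - p_head[x])
  joint <- joint + prior[x] * outer(marg, marg)
}

# Posterior of X given A = {Y1 = 1, Y2 = 0, Y3 = 1, Y4 = 1}: three heads, one tail
lik_A <- p_head^3 * (1 - p_head)
post_A <- prior * lik_A / sum(prior * lik_A)

# P(Y6 = 1 | A), using that Y6 and A are independent given X
p_Y6_given_A <- sum(p_head * post_A)
```

Printing joint * 32, post_A * 43 and p_Y6_given_A * 172 makes the comparison with the fractions above straightforward.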

SOLUTION 2
Consider the random variable $S$ with values in $\{1, \dots, 5\}$ which represents the choice of the
server. Let $R$ be the event that the server rejected. We have

\[ P(R \mid S = i) = \frac{i}{5}, \qquad P(S = i) = \frac{1}{5}, \qquad i = 1, \dots, 5. \]

Thus Bayes' formula gives

\[ P(S = i \mid R) = \frac{\frac{i}{5}}{\sum_{j=1}^{5} \frac{j}{5}}, \qquad i = 1, \dots, 5. \]

a) Similarly, for the event $R_2$ that the chosen server rejected twice we have

\[ P(R_2 \mid S = i) = \Big(\frac{i}{5}\Big)^2, \qquad P(S = i) = \frac{1}{5}, \qquad i = 1, \dots, 5. \]

Thus Bayes' formula gives

\[ P(S = i \mid R_2) = \frac{(\frac{i}{5})^2}{\sum_{j=1}^{5} (\frac{j}{5})^2}, \qquad i = 1, \dots, 5. \]

Consider the event $T$ standing for a successful response after the request was sent to another server.
We have

\begin{align*}
P(T \mid R_2) &= \sum_{i=1}^{5} P(T \cap \{S = i\} \mid R_2) \\
&= \sum_{i=1}^{5} \frac{P(T \cap \{S = i\} \cap R_2)}{P(R_2)} \\
&= \sum_{i=1}^{5} \frac{P(T \cap \{S = i\} \cap R_2)}{P(\{S = i\} \cap R_2)} \cdot \frac{P(\{S = i\} \cap R_2)}{P(R_2)} \\
&= \sum_{i=1}^{5} \frac{P(T \cap \{S = i\})}{P(\{S = i\})} \cdot \frac{P(\{S = i\} \cap R_2)}{P(R_2)} \\
&= \sum_{i=1}^{5} P(T \mid S = i) P(S = i \mid R_2),
\end{align*}

where the conditional probabilities are obtained as

\[ P(T \mid S = i) = \sum_{j \in \{1, \dots, 5\} \setminus \{i\}} \Big(1 - \frac{j}{5}\Big) / 4, \qquad i = 1, \dots, 5. \]

> u<-(1:5)/5
> p<-u^2/sum(u^2)
> t<- (sum(1-u) -(1-u))/4
> sum(t*p)
[1] 0.4545455

b) As in the above argument, the probability that the server $i$ was chosen, under the condition
that it rejected the request, is

\[ p_i = P(S = i \mid R) = \frac{\frac{i}{5}}{\sum_{j=1}^{5} \frac{j}{5}}, \qquad i = 1, \dots, 5. \]

After the server $i$ is removed, the request is sent to one of the remaining four servers. Under
the condition that the first server was $i \in \{1, \dots, 5\}$ and the second request was rejected, the
probability that the second server is $j \in \{1, \dots, 5\} \setminus \{i\}$ equals

\[ q_{i,j} = \frac{\frac{j}{5}}{\sum_{k \in \{1, \dots, 5\} \setminus \{i\}} \frac{k}{5}}, \qquad j \in \{1, \dots, 5\} \setminus \{i\}. \]

The probability that the third request will be processed successfully, under the condition that
the servers $i, j$ were chosen in the previous steps, is

\[ l_{i,j} = \sum_{k \in \{1, \dots, 5\} \setminus \{i, j\}} \Big(1 - \frac{k}{5}\Big) / 3. \]

Finally, the desired probability is

\[ \sum_{i=1}^{5} \sum_{j \in \{1, \dots, 5\} \setminus \{i\}} p_i \, q_{i,j} \, l_{i,j}. \]

> u<-(1:5)/5
> p<-u/sum(u)
> q<-matrix(data=u, byrow = TRUE, ncol=5, nrow=5)
> d<-matrix(data=sum(u)-u, byrow=FALSE, ncol=5, nrow=5)
> q<-q/d
> diag(q)<-0
> l<-matrix(data=sum(1-u), byrow=TRUE, nrow=5, ncol=5)
> s1<-matrix(data=1-u, byrow=TRUE, nrow=5, ncol=5)
> s2<-matrix(data=1-u, byrow=FALSE, nrow=5, ncol=5)
> l<-(l-s1-s2)/3
> p%*%( (q*l)%*%rep(1, 5) )
[,1]
[1,] 0.4772672
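
The matrix computation for part b) can be cross-checked with explicit loops that follow the definitions of $p_i$, $q_{i,j}$ and $l_{i,j}$ term by term. This is only a sketch of the same calculation in a different form; the names p_i, q_ij, l_ij and total are illustrative.

```r
u <- (1:5) / 5                      # rejection probabilities of servers 1..5
total <- 0
for (i in 1:5) {
  p_i <- u[i] / sum(u)              # P(first server is i | first request rejected)
  for (j in setdiff(1:5, i)) {
    q_ij <- u[j] / sum(u[-i])       # P(second server is j | first is i, second request rejected)
    rest <- setdiff(1:5, c(i, j))   # the three servers still available
    l_ij <- mean(1 - u[rest])       # success probability on a randomly chosen remaining server
    total <- total + p_i * q_ij * l_ij
  }
}
total
```

The loop version is slower than the matrix version but makes the role of each factor explicit.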

SOLUTION 3
Let $X$ be the trial at which the third red card occurs ($r = 3$). Then:
a) This is a negative binomial distribution with probability $p = \frac{1}{2}$ and success number $r = 3$:

\[ P(X = i) = \binom{i-1}{r-1} \Big(\frac{1}{2}\Big)^i, \qquad i = 3, 4, \dots \]

b) This must be modeled using the hypergeometric distribution. Since after step $i - 1$ we must have
$r - 1$ red cards among the $i - 1$ chosen cards, followed by one red card, the probability is

\[ P(X = i) = \frac{\binom{R}{r-1} \binom{N-R}{i-1-(r-1)}}{\binom{N}{i-1}} \cdot \frac{R - (r - 1)}{N - (i - 1)} \]

with $N = 52$, $R = 26$ and $r = 3$, for $i = 3, \dots, N - R + r$.
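
Both formulas can be checked against the built-in distribution functions of R. The sketch below assumes the upper limit $N - R + r = 29$ for the trial index in part b) and truncates the infinite range in part a) at 200; the names i_a, p_a, formula_a, i_b and p_b are illustrative.

```r
r <- 3; N <- 52; R <- 26

# a) With replacement: X - r (black cards before the r-th red) is negative binomial
i_a <- r:200
p_a <- dnbinom(i_a - r, size = r, prob = 1/2)
formula_a <- choose(i_a - 1, r - 1) * (1/2)^i_a

# b) Without replacement: r - 1 reds among the first i - 1 cards, then a red card
i_b <- r:(N - R + r)
p_b <- dhyper(r - 1, m = R, n = N - R, k = i_b - 1) * (R - (r - 1)) / (N - (i_b - 1))

c(sum(p_a), sum(p_b))   # both sums should be numerically equal to 1
```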


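SOLUTION 4
a) Conditioned on $\{Y = 5\}$, $X$ follows the multinomial distribution $M(5, (\frac{1}{3}, \frac{1}{3}, \frac{1}{3}))$.
b) The calculation with $\lambda = 5$ and $p = 1/3$ shows that

\begin{align*}
P(X_1 = k) &= \sum_{j=k}^{\infty} P(X_1 = k \mid Y = j) P(Y = j) \\
&= \sum_{j=k}^{\infty} \binom{j}{k} p^k (1 - p)^{j-k} \frac{\lambda^j}{j!} e^{-\lambda} \\
&= \sum_{j=0}^{\infty} \binom{j+k}{k} p^k (1 - p)^{j} \frac{\lambda^{j+k}}{(j+k)!} e^{-\lambda} \\
&= \sum_{j=0}^{\infty} \frac{(j+k)!}{j! \, k!} \frac{1}{(j+k)!} (\lambda p)^k (\lambda (1 - p))^j e^{-\lambda} \\
&= \frac{(\lambda p)^k}{k!} e^{-\lambda} e^{\lambda (1 - p)} = \frac{(\lambda p)^k}{k!} e^{-\lambda p},
\end{align*}

so $X_1$ follows a Poisson distribution with parameter $\lambda p = 5/3$.

c) Using the above argument, we obtain

\[ P(X_i = 0) = e^{-\lambda p}, \qquad P(\{X_i = 0\} \cap \{X_j = 0\}) = e^{-2 \lambda p}, \qquad i \neq j, \]

which gives, by inclusion-exclusion,

\begin{align*}
P\Big(\bigcup_{i=1}^{3} \{X_i = 0\}\Big) &= \sum_{i=1}^{3} P(\{X_i = 0\}) - \sum_{i < j} P(\{X_i = 0\} \cap \{X_j = 0\}) + \underbrace{P\Big(\bigcap_{i=1}^{3} \{X_i = 0\}\Big)}_{P(Y = 0)} \\
&= 3 e^{-5 \cdot \frac{1}{3}} - 3 e^{-5 \cdot \frac{2}{3}} + e^{-5}.
\end{align*}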
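
The results of b) and c) can be checked numerically. The sketch below evaluates the closed form, compares it with the complement rule under independent Poisson queue lengths (consistent with the product form $e^{-2\lambda p}$ used above), and recomputes the marginal of $X_1$ by mixing binomials over a Poisson count truncated at 100; the names closed_form, via_complement and mixed are illustrative.

```r
lambda <- 5; p <- 1/3
closed_form <- 3 * exp(-lambda * p) - 3 * exp(-2 * lambda * p) + exp(-lambda)

# Complement rule: with independent Poisson(lambda * p) queue lengths,
# P(no queue empty) = (1 - exp(-lambda * p))^3
via_complement <- 1 - (1 - exp(-lambda * p))^3

# Marginal of X1: mix Binomial(j, p) over Y = j ~ Poisson(lambda), truncated at j = 100
k <- 0:10
mixed <- sapply(k, function(kk) sum(dbinom(kk, size = 0:100, prob = p) * dpois(0:100, lambda)))
max(abs(mixed - dpois(k, lambda * p)))   # should be close to 0
```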

SOLUTION 5
a) Let $D = B \cup C$ with the probability $P(D) = 1/2 + 1/2 - 1/4 = 3/4 = d$. Denote by
$a = P(A \setminus D) = 0.1$ and introduce $s = P(A \cap D)$. Since $A$ and $D$ are independent, we have

\[ P(A \cap D) = P(A) P(D) \;\Rightarrow\; s = (a + s) d \;\Rightarrow\; s = \frac{a d}{1 - d} = 3/10. \]

That is, $P(A) = a + s = 4/10$, giving

\[ P(A \cap B \cap C) = P(A) P(B) P(C) = \frac{4}{10} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{10}. \]

b) Define $P(A \cup B) = \frac{5}{9} = r$ and $P(A) = p$. Then $P(B \setminus A) = r - p$ and $s = P(A \cap B)$ satisfies

\[ s = ((r - p) + s) p \;\Rightarrow\; s = \frac{(r - p) p}{1 - p}. \]

We need to determine $p \in [0, r]$ which maximizes $p \mapsto s(p) = \frac{(r - p) p}{1 - p}$ on the interval $[0, r]$. Having
calculated the derivative, we obtain the first-order condition

\[ \frac{(r - 2p)(1 - p) + (r - p) p}{(1 - p)^2} = 0, \]

giving the quadratic equation

\[ r - 2p + p^2 = 0, \]

whose unique solution in the interval $[0, r]$ is given by

\[ p = 1 - \sqrt{1 - r}. \]

In our case $r = 5/9$, we obtain $p = 1/3$ and $s = 1/9$, thus $P(B) = r - p + s = 1/3$. That is, the
maximal possible value of $P(A \cap B)$ is $1/9$, which is achieved with independent events $A$, $B$ with
equal probabilities $P(A) = P(B) = \frac{1}{3}$.
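
The maximization in b) and the arithmetic in a) can be confirmed numerically. This is a sketch using the general-purpose one-dimensional optimizer optimize from base R; the names s, opt, p_closed and P_A are illustrative.

```r
r <- 5/9
s <- function(p) (r - p) * p / (1 - p)     # P(A and B) as a function of p = P(A)

opt <- optimize(s, interval = c(0, r), maximum = TRUE)
p_closed <- 1 - sqrt(1 - r)                # root of r - 2p + p^2 = 0 in [0, r]

# Part a): P(A \ D) = P(A) * (1 - P(D)) with P(D) = 3/4 gives P(A)
P_A <- 0.1 / (1 - 3/4)
P_ABC <- P_A * (1/2) * (1/2)
```

Here opt$maximum approximates the maximizing p and opt$objective the maximal value of s.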

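SOLUTION 6 a) The density of the variable $(U, V)$ is given by

\[ f_{U,V}(u, v) = \frac{e^{-\frac{u^2}{2}}}{\sqrt{2\pi}} \, e^{-v} \, 1_{[0,\infty[}(v), \qquad (u, v) \in \mathbb{R}^2. \]

Using the transformation

\[ h : \mathbb{R}^2 \to \mathbb{R}^2, \qquad (u, v) \mapsto (h_1(u, v), h_2(u, v)) = (u, u + v), \]

we obtain the inverse transformations

\[ h_1^{-1}(x, y) = x, \qquad h_2^{-1}(x, y) = y - x \]

and the Jacobian

\[ J(x, y) = \begin{pmatrix} \partial_1 h_1^{-1}(x, y) & \partial_2 h_1^{-1}(x, y) \\ \partial_1 h_2^{-1}(x, y) & \partial_2 h_2^{-1}(x, y) \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}, \]

whose determinant is one:

\[ \det J(x, y) = 1, \qquad (x, y) \in \mathbb{R}^2. \]

This yields the density

\begin{align*}
f_{X,Y}(x, y) &= f_{U,V}(h^{-1}(x, y)) \det J(x, y) \\
&= \frac{e^{-\frac{x^2}{2}}}{\sqrt{2\pi}} \, e^{-(y - x)} \, 1_{[0,\infty[}(y - x) \\
&= \frac{e^{-\frac{x^2 - 2x}{2}}}{\sqrt{2\pi}} \, e^{-y} \, 1_{[x,\infty[}(y).
\end{align*}

b) The marginal densities are obtained by

\begin{align*}
f_X(x) &= \int_{\mathbb{R}} \frac{e^{-\frac{x^2}{2}}}{\sqrt{2\pi}} \, e^{-(y - x)} \, 1_{[0,\infty[}(y - x) \, dy \\
&= \frac{e^{-\frac{x^2}{2}}}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-(y - x)} \, 1_{[0,\infty[}(y - x) \, dy \\
&= \frac{e^{-\frac{x^2}{2}}}{\sqrt{2\pi}}
\end{align*}

and by

\begin{align*}
f_Y(y) &= \int_{\mathbb{R}} \frac{e^{-\frac{x^2 - 2x}{2}}}{\sqrt{2\pi}} \, e^{-y} \, 1_{[x,\infty[}(y) \, dx \\
&= e^{-y} \int_{-\infty}^{y} \frac{e^{-\frac{x^2 - 2x}{2}}}{\sqrt{2\pi}} \, dx = e^{-y + \frac{1}{2}} \int_{-\infty}^{y} \frac{e^{-\frac{x^2 - 2x + 1}{2}}}{\sqrt{2\pi}} \, dx \\
&= e^{-y + \frac{1}{2}} \, N(1, 1)(]-\infty, y]) = e^{-y + \frac{1}{2}} \, \Phi(y - 1).
\end{align*}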
Check numerically whether it is a density function:

f<-function(y)
{
exp(-y+1/2)*pnorm(mean=1, sd=1, q=y)
}

integrate(f, -55, 55)


1 with absolute error < 1.2e-07
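
Beyond integrating to one, the closed form for $f_Y$ can be compared pointwise with a direct numerical convolution of the normal and exponential densities. This is a sketch using integrate from base R; the grid of evaluation points and the names f_Y, f_Y_numeric and y_grid are illustrative choices.

```r
f_Y <- function(y) exp(-y + 1/2) * pnorm(y - 1)

# Density of U + V as a convolution: integrate dnorm(x) * dexp(y - x) over x <= y
f_Y_numeric <- function(y) {
  integrate(function(x) dnorm(x) * dexp(y - x, rate = 1), lower = -Inf, upper = y)$value
}

y_grid <- c(-2, -0.5, 0, 1, 2.5, 4)
max_diff <- max(abs(f_Y(y_grid) - sapply(y_grid, f_Y_numeric)))
max_diff   # should be close to 0, up to the quadrature error
```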
