
STAT 333 Assignment 1 SOLUTIONS

1. Consider 10 independent coin flips having probability p of landing heads.

a. Find the expected number of changeovers.
Let Xi = 1 if there is a changeover between trial i and trial i + 1, and 0 otherwise, for i = 1, 2, ..., 9. Then X, the number of changeovers, is X1 + X2 + ... + X9.
E[Xi] = P(changeover) = P(HT or TH) = p(1 - p) + (1 - p)p = 2(p - p^2), for i = 1, 2, ..., 9.
So E[X] = E[X1 + X2 + ... + X9] = E[X1] + E[X2] + ... + E[X9] = 9[2(p - p^2)] = 18(p - p^2).

b. Find the variance of the number of changeovers.
Var(Xi) = P(changeover)(1 - P(changeover)) = 2(p - p^2)(1 - 2(p - p^2)) = 2(-2p^4 + 4p^3 - 3p^2 + p), for i = 1, 2, ..., 9.
Cov(Xi, Xj) will be 0 unless i and j are adjacent. We want i < j, so let j = i + 1.
E[Xi Xi+1] = P(2 changeovers) = P(HTH or THT) = p(1 - p)p + (1 - p)p(1 - p) = p - p^2
Cov(Xi, Xi+1) = E[Xi Xi+1] - E[Xi]E[Xi+1] = (p - p^2) - (2(p - p^2))^2 = -4p^4 + 8p^3 - 5p^2 + p, for i = 1, 2, ..., 8.
So Var(X) = Σ Var(Xi) + 2 Σ Cov(Xi, Xi+1) = 9[2(-2p^4 + 4p^3 - 3p^2 + p)] + 2(8)[-4p^4 + 8p^3 - 5p^2 + p] = -100p^4 + 200p^3 - 134p^2 + 34p.

c. Describe how the mean and variance of the number of changeovers behave for different values of p. Provide a brief logical explanation.
Using Excel to plot E[X] and Var(X) for different values of p yields the following result:
[Figure: E[X] and Var(X) plotted against p, for p ranging from 0.01 to 0.97.]

Explanation: the expected value makes sense because as p approaches 0 or 1, the flips are increasingly likely to come up all T or all H, so we expect fewer changeovers; E[X] is largest at p = 0.5. For the variance, as p goes to 0 or 1, the number of changeovers approaches 0 with certainty, and hence the variance approaches 0 too. In the middle the behaviour is less obvious: the variance is highest when p is around 0.78 or 0.22 and dips lower when p is closer to 0.5. It makes sense if you remember that variance is the average squared distance from the mean. When p is close to 0.5, the mean is 4.5, and the farthest the actual number of changeovers could possibly be from it is 0 or 9 (a distance of 4.5). When p is slightly higher or lower, the mean drifts down, so the possibility of being really far away (e.g. 9 changeovers when the mean is 3, a distance of 6) drags the variance up.

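As a quick numerical check (an added sketch, not part of the original solution; it assumes NumPy and the function name is illustrative), the code below simulates the 10 flips and compares the sample mean and variance of the number of changeovers with the formulas from parts a and b.

import numpy as np

def changeover_check(p, n_flips=10, n_sims=200_000, seed=1):
    # Simulate n_sims sequences of n_flips coins and count changeovers in each.
    rng = np.random.default_rng(seed)
    flips = rng.random((n_sims, n_flips)) < p            # True = heads
    changeovers = (flips[:, 1:] != flips[:, :-1]).sum(axis=1)
    exact_mean = 18 * (p - p**2)
    exact_var = -100*p**4 + 200*p**3 - 134*p**2 + 34*p
    return changeovers.mean(), exact_mean, changeovers.var(), exact_var

print(changeover_check(0.3))   # simulated vs exact mean, then simulated vs exact variance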

2. Consider a Negative Binomial random variable Y ~ NB(r, p). Prove that Y is a proper rv iff p > 0 by the following methods:

a. Express Y as the sum of r independent Geometric random variables, and apply a known result about Geometrics.
Y = X1 + X2 + ... + Xr, where each Xi ~ Geo(p).
P(Y < ∞) = P(X1 + X2 + ... + Xr < ∞)
= P(X1 < ∞, X2 < ∞, ..., Xr < ∞)   for the sum to be finite, each Xi must be finite
= P(X1 < ∞)P(X2 < ∞)...P(Xr < ∞)   by independence
= 1 × 1 × ... × 1   since Geometrics are proper iff p > 0
= 1
So the Negative Binomial is proper iff p > 0.

b. Show that P(Y = ∞) = 0.
For Y = ∞, we would need, somewhere before the r-th S, an infinite sequence of F's.
P(Y = ∞) = P({any combination with up to r - 1 S's}, F, F, F, F, ...)
= P({any such combination}) P(F, F, F, ...)   by independence
= c × (1 - p)(1 - p)(1 - p)...
= 0 iff p > 0.
Alternative solution: P(Y = ∞) = 1 - P(Y < ∞) = 1 - ∑_{y=1}^{∞} P(Y = y). Show that the sum is 1.

c. Show (from first principles!) that E[Y] is r/p. Why does this imply Y is proper?
First let's find the mean of a Geometric.
E[X] = p + 2p(1 - p) + 3p(1 - p)^2 + 4p(1 - p)^3 + ...
= p{[1 + (1 - p) + (1 - p)^2 + (1 - p)^3 + ...] + [(1 - p) + (1 - p)^2 + (1 - p)^3 + ...] + [(1 - p)^2 + (1 - p)^3 + ...] + ...}
= p{1/p + (1 - p)/p + (1 - p)^2/p + ...}
= (p/p) × 1/(1 - (1 - p))
= 1/p
(There are other tricks for finding the mean of a Geometric: using a derivative trick, or multiplying the entire series through by (1 - p) and subtracting.)
And since Y = X1 + X2 + ... + Xr, E[Y] = E[X1] + E[X2] + ... + E[Xr] = r E[Xi] = r/p.
Alternative solution: work directly from the pmf of Y to show ∑_{y=1}^{∞} y P(Y = y) = r/p.
Since this is finite iff p > 0, Y must be proper, since if Y were improper, the mean would be infinite.
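A short simulation (an illustrative sketch added here, not part of the original solution; it assumes NumPy) mirrors part a: it builds Y as a sum of r independent Geometrics and checks that the sample mean is close to r/p. NumPy's geometric sampler counts the trial on which the first success occurs, matching the Geo(p) used above.

import numpy as np

rng = np.random.default_rng(0)
r, p, n_sims = 5, 0.4, 200_000
# Y = X1 + X2 + ... + Xr, each Xi ~ Geo(p): number of trials up to and
# including the first success.
Y = rng.geometric(p, size=(n_sims, r)).sum(axis=1)
print(Y.mean(), r / p)   # sample mean vs the exact value r/p = 12.5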

3. Suppose we have a series of independent trials, where the outcome is either S or F, but the probability of S on trial n is pn, not necessarily constant. Let X = number of trials until the first S, including that trial.

a. Give an expression for the probability mass function, P(X = k).
P(X = k) = P({F, F, ..., F, S})   (k - 1 F's followed by an S)
= P(F on trial 1)P(F on trial 2)...P(F on trial k - 1)P(S on trial k)   since trials are independent
= (1 - p1)(1 - p2)...(1 - p_{k-1}) pk
= [∏_{n=1}^{k-1} (1 - pn)] pk,   for k = 1, 2, 3, ...

b. Give an expression for P(X = ∞).
P(X = ∞) = P({F, F, F, F, ...})   (an infinite sequence of F's)
= P(F on trial 1)P(F on trial 2)P(F on trial 3)P(F on trial 4)...   since trials are independent
= (1 - p1)(1 - p2)(1 - p3)(1 - p4)...
= ∏_{n=1}^{∞} (1 - pn)
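As an aside (an illustrative sketch added here, not part of the original solution), the pmf in a and the product in b are easy to evaluate numerically for any given sequence p1, p2, ...; the sequence below anticipates part e.

from math import prod

def pmf_first_success(k, p):
    # P(X = k) = (1 - p1)(1 - p2)...(1 - p_{k-1}) * pk, with p = [p1, p2, ...]
    return prod(1 - p[n] for n in range(k - 1)) * p[k - 1]

p = [3.0**(-n) for n in range(1, 41)]   # pn = 3^(-n), the sequence used in part e
print(sum(pmf_first_success(k, p) for k in range(1, 41)))  # about 0.44, so P(X = ∞) is about 0.56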

c. Prove that if pn = a^n, where 0 < a < 1, then ∏_{n=1}^{k} (1 - a^n) ≥ 1 - a - a^2 - ... - a^k.

Base case(s) [only one actually needed]
k = 1: 1 - a ≥ 1 - a
k = 2: (1 - a)(1 - a^2) = 1 - a - (1 - a)a^2 ≥ 1 - a - a^2, since (1 - a) < 1
k = 3: (1 - a)(1 - a^2)(1 - a^3) ≥ (1 - a - a^2)(1 - a^3), applying the result for k = 2
= 1 - a - a^2 - (1 - a - a^2)a^3 ≥ 1 - a - a^2 - a^3, since (1 - a - a^2) < 1

Inductive step
Assume true for k = m. That is, assume ∏_{n=1}^{m} (1 - a^n) ≥ 1 - a - a^2 - ... - a^m.
Now for k = m + 1:
∏_{n=1}^{m+1} (1 - a^n) = [∏_{n=1}^{m} (1 - a^n)](1 - a^(m+1))
≥ (1 - a - a^2 - ... - a^m)(1 - a^(m+1))   applying the induction hypothesis
= 1 - a - a^2 - ... - a^m - (1 - a - a^2 - ... - a^m)a^(m+1)
≥ 1 - a - a^2 - ... - a^m - a^(m+1)   since (1 - a - ... - a^m) < 1
So the inequality holds for k = m + 1, and thus holds for all positive integers k by the principle of mathematical induction.
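A quick numerical sanity check of this inequality (an added sketch, not part of the original proof), using Python's math.prod:

from math import prod

def check_inequality(a, k_max=30):
    # Verify prod_{n=1}^{k} (1 - a^n) >= 1 - a - a^2 - ... - a^k for k up to k_max.
    for k in range(1, k_max + 1):
        lhs = prod(1 - a**n for n in range(1, k + 1))
        rhs = 1 - sum(a**n for n in range(1, k + 1))
        assert lhs >= rhs - 1e-12, (a, k, lhs, rhs)

for a in (0.1, 1/3, 0.5, 0.9):
    check_inequality(a)
print("inequality verified for the tested values of a")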

d. Using the general result in c, when we take the limit as k goes to ∞, what happens to ∏_{n=1}^{∞} (1 - pn) if ∑_{n=1}^{∞} pn diverges? What if ∑_{n=1}^{∞} pn converges to something < 1?

We showed that ∏_{n=1}^{k} (1 - pn) ≥ 1 - ∑_{n=1}^{k} pn (the same induction as in c works for any sequence of pn's in [0, 1]). So as k approaches ∞, ∏_{n=1}^{∞} (1 - pn) ≥ 1 - ∑_{n=1}^{∞} pn.

If ∑_{n=1}^{∞} pn diverges, the right-hand side goes to -∞, so the inequality gives no useful lower bound. But recall these pn's are probabilities, so the only thing we know is ∏_{n=1}^{∞} (1 - pn) ≥ 0. (It actually turns out that ∏_{n=1}^{∞} (1 - pn) = 0 in this case!)

On the other hand, if ∑_{n=1}^{∞} pn converges to a value < 1 (say b), then ∏_{n=1}^{∞} (1 - pn) ≥ 1 - b > 0.
(This is a strict inequality, rather than simply knowing the product is non-negative.) See the new handout for a summary of this rule.

e. Back to our trials. Suppose pn = 3^(-n). Determine, using the results in c and d, whether X is proper or improper by finding P(X = ∞).
P(X = ∞) = ∏_{n=1}^{∞} (1 - pn)   from b
= ∏_{n=1}^{∞} (1 - 3^(-n))   in this case
≥ 1 - ∑_{n=1}^{∞} 3^(-n)   from c and d, where a = 3^(-1) = 1/3
= 1 - (1/3)/(1 - 1/3) = 1 - 1/2 = 0.5
The probability of X equaling ∞ is at least 0.5, which is clearly greater than 0. So X is an improper random variable.
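For concreteness (an added numerical aside, not part of the original solution), the infinite product itself can be approximated by truncating it:

from math import prod

# Truncate the infinite product at n = 60; later factors are essentially 1.
p_infinity = prod(1 - 3.0**(-n) for n in range(1, 61))
print(p_infinity)   # about 0.560, consistent with the bound P(X = ∞) ≥ 0.5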

4. Suppose X1, X2, X3, and Y are independent continuous uniform random variables over the unit interval (0, 1).

a. Find P(Xi < Y).
Quick method: since Xi and Y are i.i.d. random variables, by symmetry each is equally likely to be the larger of the two. Therefore P(Xi < Y) = P(Xi > Y) = 1/2.
Longer method (by conditioning on Y):
P(Xi < Y) = ∫_0^1 P(Xi < Y | Y = y) dy   since the pdf f(y) = 1 for U(0, 1)
= ∫_0^1 P(Xi < y | Y = y) dy   substitution
= ∫_0^1 P(Xi < y) dy   since Xi and Y are independent
= ∫_0^1 y dy   since Xi is U(0, 1) and its cdf is F(y) = y
= 1/2

b. Find P(Xi < Y AND Xj < Y) for i ≠ j.
Quick method: again by symmetry, since each of Xi, Xj, and Y is equally likely to be the largest of the three, it follows that P(Xi < Y AND Xj < Y) = 1/3.
Longer method (by conditioning on Y):
P(Xi < Y AND Xj < Y) = ∫_0^1 P(Xi < Y AND Xj < Y | Y = y) dy   since the pdf f(y) = 1
= ∫_0^1 P(Xi < y AND Xj < y | Y = y) dy   substitution
= ∫_0^1 P(Xi < y AND Xj < y) dy   since Xi and Xj are independent of Y
= ∫_0^1 P(Xi < y) P(Xj < y) dy   since Xi and Xj are independent of each other
= ∫_0^1 y^2 dy   since Xi and Xj are U(0, 1)
= 1/3
Note: even though the four variables are independent, the events Xi < Y and Xj < Y are NOT independent, since they both depend on Y. So we CANNOT simply say the answer is (1/2)(1/2) = 1/4.

c. Use the above to evaluate Var(Z), where Z is the number of Xi's that are < Y.
We can express Z as the sum of three indicators, one for each of the Xi's being < Y:
Z = I1 + I2 + I3
Var(Ii) = P(Xi < Y)(1 - P(Xi < Y)) = (1/2)(1 - 1/2) = 1/4   from a
Cov(Ii, Ij) = P(Xi < Y AND Xj < Y) - P(Xi < Y)P(Xj < Y) = 1/3 - (1/2)^2 = 1/12   from b and a
So Var(Z) = Var(I1) + Var(I2) + Var(I3) + 2 Σ_{i<j} Cov(Ii, Ij) = 3 Var(Ii) + 6 Cov(Ii, Ij) = 3(1/4) + 6(1/12) = 5/4

This can also be done using symmetry arguments. The position of Y when the four variables are ordered smallest to largest is equally likely to be smallest (in which case Z = 0), second-smallest (Z = 1), second-largest (Z = 2), or largest (Z = 3).
E[Z] = (1/4)(0 + 1 + 2 + 3) = 6/4
E[Z^2] = (1/4)(0^2 + 1^2 + 2^2 + 3^2) = 14/4
So Var(Z) = 14/4 - (6/4)^2 = 5/4
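A brief simulation (an added sketch, not part of the original solution; it assumes NumPy) confirming E[Z] = 3/2 and Var(Z) = 5/4:

import numpy as np

rng = np.random.default_rng(2024)
n_sims = 500_000
X = rng.random((n_sims, 3))   # X1, X2, X3 ~ U(0, 1)
Y = rng.random((n_sims, 1))   # Y ~ U(0, 1)
Z = (X < Y).sum(axis=1)       # number of Xi's below Y in each replication
print(Z.mean(), 6/4)          # expect roughly 1.5
print(Z.var(), 5/4)           # expect roughly 1.25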

5. A gambler wins each game with probability p. Find the expected total number of wins if:

a. The gambler plays n games. If he/she wins X of these games, then he/she will play an additional X games, and then stop.
Let W = total number of wins.
E[W | X = x] = x + px   since we know they win x of the first n games, and they will on average win px of the additional x games
So E[W | X] = (1 + p)X
E[W] = E[E[W | X]]   by double averaging
= E[(1 + p)X] = (1 + p)E[X] = (1 + p)np   since X ~ Bin(n, p) with mean np

b. The gambler plays until he/she wins once. If it takes him/her Y games to get this win, he/she will play an additional Y games, and then stop.
Again let W = total number of wins.
E[W | Y = y] = 1 + py   since we know they win 1 of the first y games, and they will on average win py of the additional y games
So E[W | Y] = 1 + pY
E[W] = E[E[W | Y]]   by double averaging
= E[1 + pY] = 1 + pE[Y] = 1 + p(1/p) = 2   since Y ~ Geo(p) with mean 1/p
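Both answers are easy to confirm by simulation (an added sketch, not part of the original solution; it assumes NumPy, and the parameter values below are arbitrary):

import numpy as np

rng = np.random.default_rng(7)
n_sims, n, p = 200_000, 10, 0.3

# Scenario a: play n games, then X additional games, where X = wins so far.
X = rng.binomial(n, p, size=n_sims)
wins_a = X + rng.binomial(X, p)         # wins in the first n games + wins in the extra X games
print(wins_a.mean(), (1 + p) * n * p)   # expect roughly (1 + p)np = 3.9

# Scenario b: play until the first win (Y games), then Y additional games.
Y = rng.geometric(p, size=n_sims)
wins_b = 1 + rng.binomial(Y, p)         # the guaranteed first win + wins in the extra Y games
print(wins_b.mean(), 2)                 # expect roughly 2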
