
Course: Introduction to Stochastic Processes


Term: Fall 2019
Instructor: Gordan Žitković

Lecture 1
Discrete random variables

1.1 Random Variables


A large chunk of probability is about random variables. Instead of giving a
precise definition, let us just mention that a random variable can be thought
of as an uncertain quantity (usually numerical, i.e., with values in the set of
real numbers R, but not always).
While it is true that we do not know with certainty what value a random
variable X will take, we usually know how to assign a number - the proba-
bility - that its value will be in some¹ subset of R. For example, we might be
interested in P[ X ≥ 7], P[ X ∈ [2, 3.1]] or P[ X ∈ {1, 2, 3}].
Random variables are usually divided into discrete and continuous, even
though there exist random variables which are neither discrete nor continu-
ous. Those can be safely neglected for the purposes of this course, but they
play an important role in many areas of probability and statistics.

1.2 Discrete random variables


Before we define discrete random variables, we need some vocabulary.

Definition 1.2.1. Given a set B, we say that the random variable X is
B-valued if P[X ∈ B] = 1.

In words, X is B-valued if we know for a fact that X will never take a
value outside of B.

Definition 1.2.2. A random variable is said to be discrete if there exists
a set S such that S is either finite or countableᵃ and X is S-valued.

ᵃ Countable means that its elements can be enumerated by the natural numbers. The
only (infinite) countable sets we will need are N = {1, 2, . . . } and N0 = {0, 1, 2, . . . }.

¹ We will not worry about measurability and similar subtleties in this class.


Definition 1.2.3. The support S_X of the discrete random variable X is
the smallest set S such that X is S-valued.

Example 1.2.4. A die is thrown and the number obtained is recorded and
denoted by X. The possible values of X are S = {1, 2, 3, 4, 5, 6} and each
happens with probability 1/6, so X is certainly S-valued. Since S is
finite, X is discrete.
One still needs to argue that S is the support S_X of X. The alternative
would be that S_X is a proper subset of S, i.e., that there are redundant
elements in S. This is not the case, since all elements of S are "impor-
tant", i.e., happen with positive probability. If we remove anything
from S, we are omitting a possible value for X.
On the other hand, it is certainly true that X always takes its values in
the finite set S′ = {1, 2, 3, 4, 5, 6, 7}, i.e., that X is S′-valued. One has
to be careful with the terminology here: it is correct to say that X is
an S′-valued (or even N-valued) random variable, even though it only
takes the values 1, 2, . . . , 6 with positive probability.

Discrete random variables are very nice due to the following fact: in or-
der to be able to compute any conceivable probability involving a discrete
random variable X, it is enough to know how to compute the probabili-
ties P[X = x], for all x ∈ S_X. Indeed, if you are interested in figuring out
what P[ X ∈ B] is, for some set B ⊆ R (e.g., B = {5, 6, 7}, B = [3, 6], or
B = [−2, ∞)), we simply pick all x ∈ S X which are also in B and sum their
probabilities. In mathematical notation, we have
    P[X ∈ B] = ∑_{x ∈ S_X ∩ B} P[X = x].                      (1.2.1)
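
Formula (1.2.1) translates directly into a few lines of code. The following Python snippet is a minimal sketch (the dictionary pmf and the helper prob_in are illustrative names, not part of the notes): it stores the pmf of a fair die and sums it over S_X ∩ B.

    # Hypothetical example: pmf of a fair die, stored as {value: probability}.
    pmf = {x: 1/6 for x in range(1, 7)}      # support S_X = {1, ..., 6}

    def prob_in(pmf, B):
        """Return P[X in B] by summing p_X(x) over all x in S_X that also lie in B."""
        return sum(p for x, p in pmf.items() if x in B)

    print(prob_in(pmf, {5, 6, 7}))           # P[X in {5, 6, 7}] = 1/3 (7 lies outside the support)
    print(prob_in(pmf, range(3, 7)))         # P[X in [3, 6]] = 2/3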

Definition 1.2.5. The probability mass function (pmf) of a discrete
random variable X is a function p_X defined on the support S_X of X by

    p_X(x) = P[X = x],   x ∈ S_X.

In practice, we usually present the pmf p_X in the form of a table (called
the distribution table) as

    X ∼   x        x1    x2    x3    ...
          p_X(x)   p1    p2    p3    ...

or, simply,

    X ∼   x1    x2    x3    ...
          p1    p2    p3    ...


where the top row lists all the elements x of the support S_X of X, and the
bottom row lists their probabilities p_X(x) = P[X = x]. It is easy to see that
the function p_X has the following properties:
1. p_X(x) ∈ [0, 1] for all x, and
2. ∑_{x ∈ S_X} p_X(x) = 1.
Here is a first round of examples of discrete random variables and their
supports.

Example 1.2.6.

1. A fair (unbiased) coin is tossed and the value observed is denoted by X.


Since the only possible values X can take are H or T, and the set
S = { H, T } is clearly finite, X is a discrete random variable. Its
distribution is given by the following table:

    x        H     T
    p_X(x)   1/2   1/2

Both H and T are possible (each happens with probability 1/2), so no
smaller set S will have the property that P[X ∈ S] = 1. Conse-
quently, the support S_X of X is S = {H, T}.
2. A die is thrown and the number obtained is recorded and denoted by X.
The possible values of X are {1, 2, 3, 4, 5, 6} and each happens with
probability 1/6, so X is discrete with support S_X = {1, 2, 3, 4, 5, 6}. Its
distribution is given by the table

    x        1     2     3     4     5     6
    p_X(x)   1/6   1/6   1/6   1/6   1/6   1/6

3. A fair coin is thrown repeatedly until the first H is observed; the number
of Ts observed before that is denoted by X. In this case we know that
X can take any of the values N0 = {0, 1, 2, . . . } and that there is
no finite upper bound for it. Nevertheless, we know that X cannot
take values that are not non-negative integers. Therefore, X is N0-
valued and, in fact, S_X = N0 is its support. Indeed, we have
P[X = x] = 2^{−x−1}, for x ∈ N0, i.e.,

    X ∼   x        0     1     2     ...
          p_X(x)   1/2   1/4   1/8   ...

(A small simulation sketch checking these probabilities appears right
after this example.)

4. A card is drawn randomly from a standard deck, and the result is de-
noted by X. This example is similar to 2. above, since X takes
one of finitely many values, and all values are equally likely. The
difference is that the result is not a number anymore. The set
S of all possible values can be represented as the set of all pairs
like (♠, 7), where the first entry denotes the picked card's suit (in
{♥, ♠, ♣, ♦}), and the second is a number between 1 and 13. It
is, of course, possible to use different conventions and use the set
{2, 3, . . . , 9, 10, J, Q, K, A} for the second component. The point is
that the values X takes are not numbers.
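
Here is the simulation sketch promised in part 3. above (purely illustrative; the function name and number of trials are arbitrary choices): it repeatedly tosses a fair coin until the first H and compares the empirical frequencies with P[X = x] = 2^{−x−1}.

    import random

    def tails_before_first_head():
        """Simulate one run of Example 1.2.6.3: count the T's before the first H."""
        count = 0
        while random.random() >= 0.5:    # 'tails' with probability 1/2
            count += 1
        return count

    trials = 100_000
    samples = [tails_before_first_head() for _ in range(trials)]
    for x in range(5):
        empirical = samples.count(x) / trials
        print(x, round(empirical, 4), 2 ** (-x - 1))   # should be close to P[X = x] = 2^{-x-1}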

1.3 Events and Bernoulli random variables


Random variables X which can only take one of two values 0, 1, i.e., for
which S X ⊆ {0, 1}, are called indicators or Bernoulli random variables and
are very useful in probability and statistics (and elsewhere). The name comes
from the fact that you should think of such variables as signal lights; if X = 1
an event of interest has happened, and if X = 0 it has not happened. In other
words, X indicates the occurrence of an event.
One reason the Bernoulli random variables are so useful is that they let
us manipulate events without ever leaving the language of random variables.
Here is an example:

Example 1.3.1. Suppose that two dice are thrown so that X1 and X2 are
the numbers obtained (both X1 and X2 are discrete random variables
with S X1 = S X2 = {1, 2, 3, 4, 5, 6}). If we are interested in the probabil-
ity that their sum is at least 9, we proceed as follows. We define the
random variable W - the sum of X1 and X2 - by W = X1 + X2 . An-
other random variable, let us call it X, is a Bernoulli random variable
defined by

    X =  1,  if W ≥ 9,
         0,  if W < 9.
With such a set-up, X signals whether the event of interest has hap-
pened, and we can state our original problem in terms of X, namely
“Compute P[ X = 1] !”.
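
Since the two dice produce only 36 equally likely outcomes, P[X = 1] can be found by direct enumeration. The snippet below is a minimal sketch of that computation (illustrative only, not part of the original notes).

    from fractions import Fraction
    from itertools import product

    # Enumerate the 36 equally likely outcomes of two fair dice.
    prob = Fraction(0)
    for x1, x2 in product(range(1, 7), repeat=2):
        if x1 + x2 >= 9:                 # the event indicated by X = 1
            prob += Fraction(1, 36)

    print(prob)                          # 5/18, i.e. P[X = 1] = P[W >= 9]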

This example is, admittedly, a little contrived. The point, however, is that
anything can be phrased in terms of random variables; thus, if you know how
to work with random variables, i.e., know how to compute their distributions,
you can solve any problem in probability that comes your way.
Another reason Bernoulli random variables are useful is the fact that we
can do arithmetic with them.


Example 1.3.2. 70 coins are tossed and their outcomes are denoted by
W_1, W_2, . . . , W_70. All W_i are random variables with values in {H, T}
(and therefore not Bernoulli random variables), but they can be easily
recoded into Bernoulli random variables as follows:

    X_i =  1,  if W_i = H,
           0,  if W_i = T.

Once you have the "dictionary" {1 ↔ H, 0 ↔ T}, the random variables X_i
and W_i carry exactly the same information. The advantage of using X_i
is that the random variable

    N = ∑_{i=1}^{70} X_i,

which takes values in S_N = {0, 1, 2, . . . , 70}, counts the number of
heads among W_1, . . . , W_70. Similarly, the random variable

    M = X_1 × X_2 × · · · × X_70

is a Bernoulli random variable itself. What event does it indicate?
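
A quick sketch of the recoding (again illustrative, with made-up variable names): draw 70 random tosses, translate them through the dictionary {1 ↔ H, 0 ↔ T}, and form N and M.

    import random

    W = [random.choice("HT") for _ in range(70)]   # outcomes W_1, ..., W_70
    X = [1 if w == "H" else 0 for w in W]          # recode: H -> 1, T -> 0

    N = sum(X)          # counts the number of heads among W_1, ..., W_70
    M = 1
    for x in X:
        M *= x          # M = X_1 * X_2 * ... * X_70  (which event does M indicate?)

    print(N, M)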

1.4 Some widely used discrete random variables


The distribution of a random variable is sometimes defined as “the collection
of all possible probabilities associated to it”. This sounds a bit abstract, and,
at least in the discrete case, obscures the practical significance of this impor-
tant concept. We have learned that for discrete variables the knowledge of the
pmf or the distribution table (such as the one in part 1., 2. or 3. of Example
1.2.6) amounts to the knowledge of the whole distribution. It turns out that
many random variables in widely different contexts come with the same (or
similar) distribution tables, and that some of those appear so often that they
deserve to be named (so that we don’t have to write the distribution table
every time). The following example lists some of these named distributions.
There are many others, but we will not need them in these notes.

Example 1.4.1.
1. Bernoulli distribution. We have already encountered this distri-
bution in our discussion of indicator random variables above. It is
characterized by the distribution table of the form
    0        1
    1 − p    p                                                 (1.4.1)


where p can be any number in (0, 1). Strictly speaking, each value
of p defines a different distribution, so it would be more correct to
speak of a parametric family of distributions, with p ∈ (0, 1) being
the parameter.
In order not to write down the table (1.4.1) every time, we also use
the notation X ∼ B( p). For example, the Bernoulli random variable
which takes the value 1 when a fair coin falls H and 0 when it falls
T has a B(1/2)-distribution.
An experiment (random occurrence) which can end in two possible
ways (usually called success and failure, even though those names
should not always be taken literally) is often called a Bernoulli
trial. If we “encode” success as 1 and failure by 0, each Bernoulli
trial gives rise to a Bernoulli random variable.
2. Binomial distribution. A random variable whose distribution ta-
ble looks like this

    0      1                        ...    n − 1                       n
    q^n    (n choose 1) p q^{n−1}   ...    (n choose n−1) p^{n−1} q    p^n

(that is, P[X = k] = (n choose k) p^k q^{n−k} for k = 0, 1, . . . , n),
for some n ∈ N, p ∈ (0, 1) and q = 1 − p, is called the binomial
distribution, usually denoted by b(n, p). Remember that the bino-
mial coefficient (n choose k) is given by

    (n choose k) = n! / (k!(n − k)!),   where n! = n(n − 1)(n − 2) · . . . · 2 · 1.
Binomial distribution(s) form a parametric family with two param-
eters n ∈ N and p ∈ (0, 1), and each pair (n, p) corresponds to a
different binomial distribution.





[Figure 1. The probability mass function (pmf) of a typical binomial distribution.]


Recall that the binomial distribution arises as the "number of successes
in n independent Bernoulli trials", i.e., it counts the number
of H in n independent tosses of a biased coin whose probability of
H is p.
3. Geometric distribution. The geometric distribution is similar to
the binomial in that it is built out of the sequence of "successes"
and "failures" in independent, repeated Bernoulli trials. The dif-
ference is that the number of trials is no longer fixed (i.e., = n), but
we keep tossing until we get our first success. Since the trials are
independent, if the probability of success in each trial is p ∈ (0, 1),
the probability that it will take exactly k failures before the first
success is q^k p, where q = 1 − p. Therefore, the geometric distri-
bution - denoted by g(p) - comes with the following table

    0    1     2       3       ...
    p    qp    q^2 p   q^3 p   ...




[Figure 2. The probability mass function (pmf) of a typical geometric distribution.]


Caveat: When defining the geometric distribution, some
people count the number of trials to the first success, i.e.,
add the final success into the count. This shifts everything
by 1 and leads to a distribution with support N (and not
N0 ). While this is no big deal, this ambiguity tends to be
confusing at times and leads to bugs in software. For us, the
geometric distribution will always start from 0. The distri-
bution which counts the final success will be referred to as
the shifted geometric distribution, but we’ll try to avoid it
altogether.

4. Poisson distribution. This is also a family of distributions, param-
eterized by a single parameter λ > 0, and denoted by P(λ). Its
support is N0 and the distribution table is given by

    0         1           2               3               4               ...
    e^{−λ}    e^{−λ} λ    e^{−λ} λ^2/2!   e^{−λ} λ^3/3!   e^{−λ} λ^4/4!   ...

The closed form for the pmf is

    p_X(x) = e^{−λ} λ^x / x!,   x ∈ N0.

The Poisson distribution arises as a limit when n → ∞ and p → 0
while np ∼ λ in the binomial distribution (a small numerical illustration
of this limit is sketched after Figure 3).

[Figure 3. The probability mass function (pmf) of a typical Poisson distribution with λ > 1.]
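
The pmfs above are easy to compute directly. The following Python sketch is my own illustration (the function names binom_pmf and poisson_pmf are made up here, and math.comb requires Python 3.8 or newer); it tabulates the binomial and Poisson pmfs and shows numerically how b(n, λ/n) approaches P(λ) as n grows.

    from math import comb, exp, factorial

    def binom_pmf(k, n, p):
        """b(n, p): P[X = k] = (n choose k) p^k (1 - p)^(n - k)."""
        return comb(n, k) * p ** k * (1 - p) ** (n - k)

    def poisson_pmf(k, lam):
        """P(lam): P[X = k] = e^(-lam) lam^k / k!."""
        return exp(-lam) * lam ** k / factorial(k)

    # The limit n -> infinity, p -> 0 with n p ~ lam: b(n, lam/n) approaches P(lam).
    lam = 2.0
    for n in (10, 100, 1000):
        print(n, [round(binom_pmf(k, n, lam / n), 4) for k in range(5)])
    print("P(2):", [round(poisson_pmf(k, lam), 4) for k in range(5)])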


1.5 Expectations and standard deviations


Expectations and standard deviations provide summaries of numerical ran-
dom variables - they give us some information about them without over-
whelming us with the entire distribution table. The expectation can be thought
of as a center of the distribution, while the standard deviation gives you an
idea about its spread².

Definition 1.5.1. For a discrete random variable X with support S_X ⊆
R, we define the expectation E[X] of X by

    E[X] = ∑_{x ∈ S_X} x p_X(x),                               (1.5.1)

if the (possibly) infinite sum ∑_{x ∈ S_X} x p_X(x) converges absolutely, i.e., as
long as

    ∑_{x ∈ S_X} |x| p_X(x) < ∞.                                (1.5.2)

When the sum in (1.5.2) above diverges (i.e., takes the value +∞), we
say that the expectation of X is not defined.
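
For a finite support, formula (1.5.1) is a plain finite sum. As a minimal sketch (using the fair-die pmf from Example 1.2.6 and exact fractions; the variable names are mine), one can compute E[X] directly:

    from fractions import Fraction

    # Distribution table of the fair die: x -> p_X(x).
    pmf = {x: Fraction(1, 6) for x in range(1, 7)}

    expectation = sum(x * p for x, p in pmf.items())   # formula (1.5.1)
    print(expectation)                                 # 7/2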

Perhaps the most important property of the expectation is its linearity:

Theorem 1.5.2. If E[X] and E[Y] are both defined, then so is E[αX + βY]
for any two constants α, β. Moreover,

    E[αX + βY] = αE[X] + βE[Y].

In order to define the standard deviation, we first need to define the vari-
ance. Like the expectation, the variance may or may not be defined (depend-
ing on whether the sums used to compute it converge absolutely or not).
Since we will be working only with distributions for which the existence of
expectation(s) is never a problem, we do not mention this issue in the sequel.

Definition 1.5.3. The variance of the random variable X is

    Var[X] = E[(X − µ_X)^2] = ∑_{x ∈ S_X} (x − µ_X)^2 p_X(x),   where µ_X = E[X].

The standard deviation of X is

    sd[X] = √Var[X].

² This should be taken with a grain of salt. After all, what exactly do we mean by a center or
a spread of a distribution?


The fundamental properties of the variance/standard deviation are given
in the following theorem³:

Theorem 1.5.4. Suppose that X and Y are random variables and that α is a
constant. Then
1. Var[αX] = α^2 Var[X], and
2. if, additionally, X and Y are independent, then

    Var[X + Y] = Var[X] + Var[Y].

Caveat: These properties are not the same as the properties of the ex-
pectation. First of all, the constant comes out of the variance with a
square, and second, the variance of the sum is the sum of the indi-
vidual variances only if additional assumptions, such as the indepen-
dence of the two variables, are imposed.
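
Both points of the caveat can be checked on a concrete example. The sketch below (my own illustration, using two independent fair dice) computes exact variances from the pmfs: Var[X + Y] equals Var[X] + Var[Y] under independence, while Var[X + X] = Var[2X] = 4 Var[X], not 2 Var[X].

    from fractions import Fraction
    from itertools import product

    pmf = {x: Fraction(1, 6) for x in range(1, 7)}        # one fair die

    def var(dist):
        """Variance of a pmf given as {value: probability}."""
        mu = sum(x * p for x, p in dist.items())
        return sum((x - mu) ** 2 * p for x, p in dist.items())

    # Joint pmf of two *independent* dice: product of the marginals, collected by the sum.
    sum_dist = {}
    for (x, px), (y, py) in product(pmf.items(), repeat=2):
        sum_dist[x + y] = sum_dist.get(x + y, 0) + px * py

    print(var(sum_dist), 2 * var(pmf))                    # both 35/6: Var[X+Y] = Var[X] + Var[Y]
    print(var({2 * x: p for x, p in pmf.items()}))        # Var[2X] = 4 Var[X] = 35/3, not 2 Var[X]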

Finally, here is a very useful alternative formula for the variance of a
random variable:

Proposition 1.5.5. Var[X] = E[X^2] − (E[X])^2.
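
Proposition 1.5.5 is easy to verify numerically; here is a minimal sketch for the fair-die pmf (illustrative variable names only).

    from fractions import Fraction

    pmf = {x: Fraction(1, 6) for x in range(1, 7)}

    mu  = sum(x * p for x, p in pmf.items())                      # E[X]
    ex2 = sum(x * x * p for x, p in pmf.items())                  # E[X^2]
    var_def      = sum((x - mu) ** 2 * p for x, p in pmf.items()) # definition of Var[X]
    var_shortcut = ex2 - mu ** 2                                  # Proposition 1.5.5

    print(var_def, var_shortcut)                                  # both 35/12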

Let us compute expectations and variances/standard deviations for our
most important examples.

Example 1.5.6.

1. Bernoulli distribution. Let X ∼ B(p) be a Bernoulli random vari-
able with parameter p. Then (remember q is a shortcut for 1 − p)

    E[X] = 0 × q + 1 × p = p.

Using Proposition 1.5.5, we get

    Var[X] = E[X^2] − (E[X])^2 = 0^2 × q + 1^2 × p − p^2
           = p − p^2 = p(1 − p) = pq,

and, so, sd[X] = √(pq).

³ We will talk about independence in detail in the next lecture. An intuitive understanding
should be fine for now.


2. Binomial distribution. Moving on to the binomial, X ∼ b(n, p),
we could either use the formula (1.5.1) and try to evaluate the sum

    E[X] = ∑_{k=0}^{n} k (n choose k) p^k q^{n−k},

or use some of the properties of the expectation from Theorem 1.5.2.
To do the latter, we remember that the distribution of a binomial is
the same as the distribution of a sum of n (independent) Bernoul-
lis. So if we write X = X_1 + · · · + X_n, where each of X_1, . . . , X_n has the
B(p) distribution, Theorem 1.5.2 yields

    E[X] = E[X_1] + E[X_2] + · · · + E[X_n] = np.              (1.5.3)

A similar simplification can be achieved in the computation of the
variance, too. While the independence of X_1, . . . , X_n was not needed
for (1.5.3), it is crucial for Theorem 1.5.4:

    Var[X] = Var[X_1] + · · · + Var[X_n] = npq,

and, so, sd[X] = √(npq).
3. Geometric distribution. The trick from 2. above cannot be applied
to geometric random variables. If nothing else, this is because
Theorem 1.5.2 can only be applied to a given (fixed, nonrandom)
number n of random variables. We can still use the definition (1.5.1)
and evaluate an infinite sum:

    E[X] = ∑_{k=0}^{∞} k p q^k.

Instead of doing that, let us proceed somewhat informally and note
that we can think of a geometric random variable as follows:

    With probability p our first throw is a success and X =
    0. With probability q our first throw is a failure and we
    restart the experiment on the second throw, making sure
    to add the first failure to the count.

Therefore,

    E[X] = p × 0 + q × (1 + E[X]),

and, so, E[X] = q/p.
Similar reasoning can be applied to obtain

    E[X^2] = p × 0 + q E[(1 + X)^2] = q + 2q E[X] + q E[X^2]
           = q + 2q^2/p + q E[X^2],

which yields Var[X] = E[X^2] − (E[X])^2 = q/p^2 and sd[X] = √q / p.


4. Poisson distribution. We know that the Poisson distribution arises
as a limit of binomial distributions when n → ∞, p → 0 and
np ∼ λ. We can expect, therefore, that its expectation and vari-
ance should behave accordingly, i.e., for X ∼ P(λ), we have

    E[X] = λ  and  Var[X] = λ.                                 (1.5.4)

The reasoning behind Var[X] = λ uses the formula Var[X] = npq
when X ∼ b(n, p) and plugs in q ≈ 1, since q = 1 − p and p → 0. A
more rigorous way of showing that (1.5.4) is correct is to evaluate
the sums

    E[X]   = ∑_{k=0}^{∞} k p_X(k)   = ∑_{k=0}^{∞} k e^{−λ} λ^k / k!    and
    E[X^2] = ∑_{k=0}^{∞} k^2 p_X(k) = ∑_{k=0}^{∞} k^2 e^{−λ} λ^k / k!,

and use Proposition 1.5.5. The sums can be evaluated explicitly, but
since the focus of these notes is not on the evaluation of infinite sums,
we skip the details. (A short numerical check of the formulas in 3. and 4.
is sketched right after this example.)
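
As promised, here is a small numerical check of the geometric and Poisson formulas above. It is a rough sketch (my own, with arbitrary illustrative parameter values) that approximates E[X] and Var[X] by truncating the defining sums at a large cutoff.

    from math import exp, factorial

    def moments(pmf, k_max):
        """Approximate E[X] and Var[X] by truncating the defining sums at k_max."""
        mean = sum(k * pmf(k) for k in range(k_max))
        ex2 = sum(k * k * pmf(k) for k in range(k_max))
        return mean, ex2 - mean ** 2

    p, lam = 0.3, 2.5                                # arbitrary illustrative parameters
    q = 1 - p
    geom = lambda k: q ** k * p                               # g(p), support N_0
    poisson = lambda k: exp(-lam) * lam ** k / factorial(k)   # P(lam)

    print(moments(geom, 200), (q / p, q / p ** 2))   # both pairs ~ (2.3333, 7.7778)
    print(moments(poisson, 60), (lam, lam))          # both pairs ~ (2.5, 2.5)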

1.6 Problems
Problem 1.6.1. Two people are picked at random from a group of 50 and
given $10 each. After that, independently of what happened before, three
people are picked from the same group - one or more people could have
been picked both times - and given $10 each. What is the probability that at
least one person received $20?

Problem 1.6.2. A die is rolled 5 times; let the obtained numbers be given by
Y1 , . . . , Y5 . Use counting to compute the probability that
1. all Y1 , . . . , Y5 are even?
2. at most 4 of Y1 , . . . , Y5 are odd?
3. the values of Y1 , . . . , Y5 are all different from each other?

Problem 1.6.3. Identify the supports of the following random variables:


1. Y + 1, where Y ∼ B( p) (Bernoulli),
2. Y^2, where Y ∼ b(n, p) (binomial),


3. Y − 5, where Y ∼ g( p) (geometric),
4. 2Y, where Y ∼ P(λ) (Poisson).

Problem 1.6.4. Let Y denote the number of tosses of a fair die until the first
6 is obtained (if we get a 6 on the first try, Y = 0). The support SY of Y is

(a) {0, 1, 2, 3, 4, . . . }
(b) {1, 2, 3, 4, 5, 6}
(c) {1/6, 1/6, 1/6, 1/6, 1/6}
(d) {1/6, (5/6) × 1/6, (5/6)^2 × 1/6, (5/6)^3 × 1/6, . . . }

(e) none of the above

Problem 1.6.5. The probability that Janet makes a free throw is 0.6. What is
the probability that she will make at least 16 out of 23 (independent) throws?
Write down the answer as a sum - no need to evaluate it.

Problem 1.6.6. Three unbiased and independent coins are tossed. Let Y1 be
the total number of heads on the first two coins, and let Y be the random
variable which is equal to Y1 if the third coin comes up heads and −Y1 if it
comes up tails. Compute Var[Y ].

Problem 1.6.7. A die is thrown and a coin is tossed independently of it. Let
Y be the random variable which is equal to the number on the die in case the
coin comes up heads and twice the number on the die if it comes up tails.
1. What is the support S_Y of Y? What is its distribution (pmf)?
2. Compute E[Y ] and Var[Y ].

Problem 1.6.8. n people vote in a general election, with only two candidates
running. The vote of person i is denoted by Yi and it can take values 0 and 1,
depending on which candidate they voted for (we encode one of them as 0 and
the other as 1). We assume that votes are independent of each other and that
each person votes for candidate 1 with probability p. If the total number of
votes for candidate 1 is denoted by Y, then

(a) Y is a geometric random variable


(b) Y 2 is a binomial random variable


(c) Y is uniform on {0, 1, . . . , n}


(d) Var[Y ] ≤ E[Y ]
(e) none of the above

Problem 1.6.9. A discrete random variable Y is said to have a discrete uni-
form distribution on {0, 1, 2, . . . , n}, denoted by Y ∼ u(n), if its distribution
table looks like this:

    0          1          2          ...   n
    1/(n+1)    1/(n+1)    1/(n+1)    ...   1/(n+1)

Compute the expectation and the variance of u(n). You may use the fol-
lowing identities: 1 + 2 + · · · + n = n(n + 1)/2 and 1^2 + 2^2 + · · · + n^2 =
n(n + 1)(2n + 1)/6.

Problem 1.6.10. Let X be a Poisson random variable with parameter λ > 0.


Compute the following:

1. P[ X ≥ 3],
2. (*) E[ X 3 ]. Note: The sum you need to evaluate is quite difficult. If you don’t know
the trick, do not worry. If you know how to use symbolic-computation software such as
Mathematica, feel free to use it. We will learn how to do this using generating functions later
in the class.

Problem 1.6.11. Let X be a geometric random variable with parameter p ∈


(0, 1), i.e. X ∼ g(p), and let Y = 2^{−X}. Write down the (first few entries in)
the distribution table of Y. Compute E[Y] = E[2^{−X}].

Problem 1.6.12. Let Y1 and Y2 be uncorrelated discrete random variables such


that Var[2Y1 − Y2 ] = 17 and Var[Y1 + 2Y2 ] = 5. Compute Var[Y1 − Y2 ]. Note:
Y1 and Y2 are uncorrelated if E[(Y1 − E[Y1 ])(Y2 − E[Y2 ])] = 0.
(Hint: What is Var[αY1 + βY2] in terms of Var[Y1] and Var[Y2] when Y1
and Y2 are uncorrelated?)

Problem 1.6.13. Let Y1 and Y2 be uncorrelated random variables such that


sd[Y1 + Y2 ] = 5. Then sd[Y1 − Y2 ] =
(a) 1   (b) √2   (c) √3   (d) 5   (e) not enough information is given


Problem 1.6.14. Let X be a discrete random variable with the support S_X =
N, such that P[X = n] = C (1/n^2), for n ∈ N, where C is a constant chosen so
that ∑_n P[X = n] = 1. The distribution table of X is, therefore, given by

    1            2            3            ...
    C · 1/1^2    C · 1/2^2    C · 1/3^2    ...

1. Show that E[X] does not exist.
2. Construct the distribution of a similar random variable whose expectation
does exist, but whose variance does not. (Hint: Use the same support N, but
tweak the probabilities so that the sum for E[X] converges, while the sum
for E[X^2] does not.)

Problem 1.6.15. (*) Let Y be a discrete random variable such that SY ⊆ N0 .


By counting the same thing in two different ways, explain why

    E[Y] = ∑_{n ∈ N} P[Y ≥ n].

This is called the tail formula for the expectation.

Problem 1.6.16. (*) Bob and Alice alternate taking customer calls at a call
center, with Alice always taking the first call. The number of calls during a
day has a Poisson distribution with parameter λ > 0.

1. What is the probability that Bob will take the last call of the day (this
includes the case when there are 0 calls)? (Hint: What is the Taylor series
for the function cosh(x) = (e^x + e^{−x})/2 around x = 0?)
2. Who is more likely to take the last call? Alice or Bob? As above, if there
are no calls, we give the “last call” to Bob.

Problem 1.6.17 (*). A mail lady has l ∈ N letters in her bag when she starts
her shift and is scheduled to visit n ∈ N different households during her
round. If each letter is equally likely to be addressed to any one of the n
households, and the letters are delivered independently of each other, what
is the expected number of households that will receive at least one letter?
Note: It is quite possible that some households will receive more than 1 letter.
