
CHAPTER 7

SUM OF INDEPENDENT RANDOM VARIABLES

CHAPTER CONTENTS
Convolution
Reproductive Property
Law of Large Numbers
Central Limit Theorem

In this chapter, the behavior of the sum of independent random variables is first
investigated. Then the limiting behavior of the mean of independent and identically
distributed (i.i.d.) samples when the number of samples tends to infinity is discussed.

7.1 CONVOLUTION
Let x and y be independent discrete random variables, and let z be their sum:

z = x + y.

Since z = x + y is satisfied when y = z − x, the probability of z can be computed by
summing the joint probability of x and z − x over all x. For example, let z be the sum of
the outcomes of two 6-sided dice, x and y. When z = 7, the dice take

(x, y) = (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1),

and summing up the probabilities of these combinations gives the probability of z = 7.
The probability mass function of z, denoted by k(z), can be expressed as

k(z) = Σ_x g(x) h(z − x),

where g(x) and h(y) are the probability mass functions of x and y, respectively. This
operation is called the convolution of x and y and denoted by x ∗ y. When x and y are
continuous, the probability density function of z = x + y, denoted by k(z), is given
similarly as

k(z) = ∫ g(x) h(z − x) dx,

where g(x) and h(y) are the probability density functions of x and y, respectively.
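As a concrete sketch of the discrete case (not from the book; it assumes NumPy is available), the dice example above can be computed by convolving the two probability mass functions with numpy.convolve:

    import numpy as np

    # Probability mass function of a single fair 6-sided die on outcomes 1..6.
    g = np.full(6, 1 / 6)

    # Convolving the two PMFs gives the PMF of the sum z = x + y on outcomes 2..12.
    k = np.convolve(g, g)
    outcomes = np.arange(2, 13)

    for z, p in zip(outcomes, k):
        print(f"P(z = {z:2d}) = {p:.4f}")

    # The six combinations (1,6), ..., (6,1) give P(z = 7) = 6/36.
    print(np.isclose(k[outcomes == 7], 6 / 36))  # [ True]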


7.2 REPRODUCTIVE PROPERTY


When the convolution of two probability distributions in the same family again yields
a probability distribution in the same family, that family of probability distributions is
said to be reproductive. For example, the normal distribution is reproductive, i.e., the
convolution of normal distributions N(µ_x, σ_x²) and N(µ_y, σ_y²) yields N(µ_x + µ_y, σ_x² + σ_y²).
When x and y are independent, the moment-generating function of their sum,
x + y, agrees with the product of their moment-generating functions:

M_{x+y}(t) = M_x(t) M_y(t).
Let x and y follow N(µ_x, σ_x²) and N(µ_y, σ_y²), respectively. As shown in Eq. (4.1), the
moment-generating function of normal distribution N(µ_x, σ_x²) is given by

M_x(t) = exp(µ_x t + σ_x² t²/2).

Thus, the moment-generating function of the sum, M_{x+y}(t), is given by

M_{x+y}(t) = M_x(t) M_y(t)
           = exp(µ_x t + σ_x² t²/2) exp(µ_y t + σ_y² t²/2)
           = exp((µ_x + µ_y) t + (σ_x² + σ_y²) t²/2).

Since this is the moment-generating function of N(µ_x + µ_y, σ_x² + σ_y²), the reproductive
property of normal distributions is proved.
Similarly, computing the moment-generating function M_{x+y}(t) for independent
random variables x and y proves the reproductive properties of the binomial,
Poisson, negative binomial, gamma, and chi-squared distributions (see Table 7.1).
The Cauchy distribution does not have a moment-generating function, but computation
of the characteristic function φ_x(t) = M_{ix}(t) (see Section 2.4.3) shows
that the convolution of Ca(a_x, b_x) and Ca(a_y, b_y) yields Ca(a_x + a_y, b_x + b_y).
On the other hand, the geometric distribution Ge(p) (which is equivalent to the
negative binomial distribution NB(1, p)) and the exponential distribution Exp(λ) (which is
equivalent to the gamma distribution Ga(1, λ)) are not reproductive: the sum of two
independent Ge(p) samples follows NB(2, p) and the sum of two independent Exp(λ)
samples follows Ga(2, λ), which lie outside the geometric and exponential families.
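The following Monte Carlo sketch (my own illustration, assuming NumPy is available) checks one row of Table 7.1 empirically: the sum of independent Po(2) and Po(3) samples is distributed like Po(5). Comparing the empirical and theoretical probability mass functions is only a sanity check, not a proof:

    import math
    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    # Sum of independent Po(2) and Po(3) samples; Table 7.1 says this should be Po(5).
    z = rng.poisson(2, n) + rng.poisson(3, n)

    emp = np.bincount(z, minlength=15)[:15] / n            # empirical P(z = k), k = 0..14
    po5 = np.array([math.exp(-5) * 5**k / math.factorial(k) for k in range(15)])

    print(np.abs(emp - po5).max())   # small (Monte Carlo error only)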

7.3 LAW OF LARGE NUMBERS


Let x_1, . . . , x_n be random variables and f(x_1, . . . , x_n) be their joint probability
mass/density function. If f(x_1, . . . , x_n) can be represented by using a probability
mass/density function g(x) as

f(x_1, . . . , x_n) = g(x_1) × · · · × g(x_n),

Table 7.1 Convolution

Distribution         x               y               x + y
Normal               N(µ_x, σ_x²)    N(µ_y, σ_y²)    N(µ_x + µ_y, σ_x² + σ_y²)
Binomial             Bi(n_x, p)      Bi(n_y, p)      Bi(n_x + n_y, p)
Poisson              Po(λ_x)         Po(λ_y)         Po(λ_x + λ_y)
Negative binomial    NB(k_x, p)      NB(k_y, p)      NB(k_x + k_y, p)
Gamma                Ga(α_x, λ)      Ga(α_y, λ)      Ga(α_x + α_y, λ)
Chi-squared          χ²(n_x)         χ²(n_y)         χ²(n_x + n_y)
Cauchy               Ca(a_x, b_x)    Ca(a_y, b_y)    Ca(a_x + a_y, b_x + b_y)

x_1, . . . , x_n are mutually independent and follow the same probability distribution.
Such x_1, . . . , x_n are said to be i.i.d. with probability density/mass function g(x) and
denoted by

x_1, . . . , x_n ∼ g(x)   (i.i.d.).

When x_1, . . . , x_n are i.i.d. random variables having expectation µ and variance
σ², the sample mean (Fig. 7.1),

x̄ = (1/n) Σ_{i=1}^n x_i,

satisfies

E[x̄] = (1/n) Σ_{i=1}^n E[x_i] = µ,

V[x̄] = (1/n²) Σ_{i=1}^n V[x_i] = σ²/n.

This means that the average of n samples has the same expectation as the original
single sample, while the variance is reduced by a factor of 1/n. Thus, as the number
of samples tends to infinity, the variance vanishes and the sample average x̄
converges to the true expectation µ.
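A quick numerical check of these two identities (my own sketch, assuming NumPy is available): draw many samples of size n from N(µ, σ²) and inspect the mean and variance of the resulting sample means:

    import numpy as np

    rng = np.random.default_rng(1)
    mu, sigma, n, trials = 3.0, 2.0, 50, 200_000

    # Draw `trials` independent samples of size n and compute each sample mean.
    xbar = rng.normal(mu, sigma, size=(trials, n)).mean(axis=1)

    print(xbar.mean())   # close to mu = 3
    print(xbar.var())    # close to sigma^2 / n = 4 / 50 = 0.08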
The weak law of large numbers asserts this fact more precisely. When the original
distribution has expectation µ, the characteristic function φ_x̄(t) of the average of
independent samples can be expressed by using the characteristic function φ_x(t) of a
single sample x as

φ_x̄(t) = (φ_x(t/n))^n = (1 + iµ t/n + · · ·)^n.

The mean of samples x_1, . . . , x_n usually refers to the arithmetic mean, but other means
such as the geometric mean and the harmonic mean are also often used:

Arithmetic mean:  (1/n) Σ_{i=1}^n x_i,

Geometric mean:   (∏_{i=1}^n x_i)^{1/n},

Harmonic mean:    1 / ((1/n) Σ_{i=1}^n (1/x_i)).

For example, suppose that a weight increased by 2%, 12%, and 4% in the last three
years, respectively. Then the average increase rate is not given by the arithmetic mean
(0.02 + 0.12 + 0.04)/3 = 0.06, but by the geometric mean
(1.02 × 1.12 × 1.04)^{1/3} ≈ 1.0591. When climbing up a mountain at 2 kilometers per
hour and going back at 6 kilometers per hour, the mean velocity is not given by the
arithmetic mean (2 + 6)/2 = 4 but by the harmonic mean 2d/(d/2 + d/6) = 3 for distance
d, according to the formula “velocity = distance/time.” When x_1, . . . , x_n > 0, the
arithmetic, geometric, and harmonic means satisfy

(1/n) Σ_{i=1}^n x_i  ≥  (∏_{i=1}^n x_i)^{1/n}  ≥  1 / ((1/n) Σ_{i=1}^n (1/x_i)),

and the equality holds if and only if x_1 = · · · = x_n. The generalized mean is defined
for p ≠ 0 as

((1/n) Σ_{i=1}^n x_i^p)^{1/p}.

The generalized mean is reduced to the arithmetic mean when p = 1, the geometric
mean when p → 0, and the harmonic mean when p = −1. The maximum of x_1, . . . , x_n
is given when p → +∞, and the minimum of x_1, . . . , x_n is given when p → −∞.
When p = 2, it is called the root mean square.

FIGURE 7.1
Arithmetic mean, geometric mean, and harmonic mean.
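The generalized mean from the box above can be written as a small function; the sketch below (my own, plain Python) uses the logarithmic form for the p → 0 limit and reproduces the three means of the weight example:

    import math

    def generalized_mean(xs, p):
        """Generalized mean of positive numbers xs; p = 0 is the geometric-mean limit."""
        if p == 0:
            return math.exp(sum(math.log(x) for x in xs) / len(xs))
        return (sum(x ** p for x in xs) / len(xs)) ** (1 / p)

    xs = [1.02, 1.12, 1.04]
    print(generalized_mean(xs, 1))    # arithmetic mean = 1.06
    print(generalized_mean(xs, 0))    # geometric mean ≈ 1.0591
    print(generalized_mean(xs, -1))   # harmonic mean (smallest of the three)
    print(generalized_mean(xs, 2))    # root mean square (largest of the four here)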

Then Eq. (3.5) shows that the limit n → ∞ of the above equation yields

lim_{n→∞} φ_x̄(t) = e^{iµt}.

FIGURE 7.2
Law of large numbers. (a) Standard normal distribution N(0, 1). (b) Standard Cauchy distribution Ca(0, 1).

Since e^{iµt} is the characteristic function of the constant µ,

lim_{n→∞} Pr(|x̄ − µ| < ε) = 1

holds for any ε > 0. This is the weak law of large numbers, and x̄ is said to
converge in probability to µ. If the original distribution also has a finite variance, the proof
is straightforward: consider the limit n → ∞ of Chebyshev’s inequality (8.4)
(see Section 8.2.2).
On the other hand, the strong law of large numbers asserts

Pr( lim_{n→∞} x̄ = µ ) = 1,

and x̄ is said to converge almost surely to µ. Almost sure convergence is a more
direct and stronger notion than convergence in probability.
Fig. 7.2 exhibits the behavior of the sample average x̄ = (1/n) Σ_{i=1}^n x_i when
x_1, . . . , x_n are i.i.d. with the standard normal distribution N(0, 1) or the standard
Cauchy distribution Ca(0, 1). The graphs show that, for the normal distribution, which
possesses an expectation, increasing n makes the sample average x̄ converge to the
true expectation 0. On the other hand, for the Cauchy distribution, which does not
have an expectation, the sample average x̄ does not converge even as n is increased.
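The behavior shown in Fig. 7.2 is easy to reproduce; the sketch below (my own, assuming NumPy and Matplotlib are available) plots the running sample averages of standard normal and standard Cauchy samples:

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(2)
    n = 10_000
    idx = np.arange(1, n + 1)

    # Running sample averages x̄_1, x̄_2, ..., x̄_n for the two distributions.
    normal_avg = np.cumsum(rng.standard_normal(n)) / idx
    cauchy_avg = np.cumsum(rng.standard_cauchy(n)) / idx

    plt.plot(idx, normal_avg, label="N(0, 1): converges to 0")
    plt.plot(idx, cauchy_avg, label="Ca(0, 1): does not converge")
    plt.xlabel("number of samples n")
    plt.ylabel("sample average")
    plt.legend()
    plt.show()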

7.4 CENTRAL LIMIT THEOREM


As explained in Section 7.2, the average of independent normal samples follows the
normal distribution. If the samples follow other distributions, which distribution does

FIGURE 7.3
Central limit theorem. The solid lines denote the normal densities. (a) Continuous uniform distribution U(0, 1). (b) Exponential distribution Exp(1). (c) Distribution used in Fig. 19.11.

the sample average follow? Fig. 7.3 exhibits the histograms of the sample averages
for the continuous uniform distribution U(0, 1), the exponential distribution Exp(1),
and the probability distribution used in Fig. 19.11, together with the normal densities
with the same expectation and variance. This shows that the histogram of the sample
average approaches the normal density as the number of samples, n, increases.
The central limit theorem asserts this fact more precisely: for the standardized random
variable

z = (x̄ − µ) / (σ/√n),

the following property holds:

lim_{n→∞} Pr(a ≤ z ≤ b) = ∫_a^b (1/√(2π)) e^{−x²/2} dx.
Since the right-hand side is the probability density function of the standard normal
distribution integrated from a to b, z is shown to follow the standard normal
distribution in the limit n → ∞. In this case, z is said to converge in law or
converge in distribution to the standard normal distribution. More informally, z is
said to asymptotically follow the normal distribution, or to possess asymptotic normality.
Intuitively, the central limit theorem shows that, for any distribution, as long as it has
expectation µ and variance σ², the sample average x̄ approximately follows the
normal distribution with expectation µ and variance σ²/n when n is large.
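In the spirit of Fig. 7.3, the following sketch (my own, assuming NumPy and Matplotlib are available) compares the histogram of the standardized sample mean z of Exp(1) samples with the standard normal density:

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(3)
    n, trials = 30, 100_000
    mu, sigma = 1.0, 1.0                      # expectation and standard deviation of Exp(1)

    # Standardized sample means of n i.i.d. Exp(1) samples.
    xbar = rng.exponential(1.0, size=(trials, n)).mean(axis=1)
    z = (xbar - mu) / (sigma / np.sqrt(n))

    grid = np.linspace(-4, 4, 200)
    plt.hist(z, bins=80, density=True, alpha=0.5, label="standardized sample mean")
    plt.plot(grid, np.exp(-grid**2 / 2) / np.sqrt(2 * np.pi), label="N(0, 1) density")
    plt.legend()
    plt.show()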
Let us prove the central limit theorem by showing that the moment-generating
function of

z = (x̄ − µ) / (σ/√n)

is given by the moment-generating function of the standard normal distribution, e^{t²/2}.
Let

y_i = (x_i − µ) / σ

and express z as

z = (1/√n) Σ_{i=1}^n (x_i − µ)/σ = (1/√n) Σ_{i=1}^n y_i.

Since y_i has expectation 0 and variance 1, the moment-generating function of y_i is
given by

M_{y_i}(t) = 1 + t²/2 + · · · .

This implies that the moment-generating function of z is given by

M_z(t) = (M_{y_i/√n}(t))^n = (M_{y_i}(t/√n))^n = (1 + t²/(2n) + · · ·)^n.

If the limit n → ∞ of the above equation is considered, Eq. (3.5) yields

lim_{n→∞} M_z(t) = e^{t²/2},

which means that z follows the standard normal distribution.
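As a small numeric illustration of this last limit (my own sketch, plain Python), (1 + t²/(2n))^n indeed approaches e^{t²/2} as n grows:

    import math

    t = 1.5
    target = math.exp(t**2 / 2)
    for n in (10, 100, 1000, 10_000):
        approx = (1 + t**2 / (2 * n)) ** n
        print(n, approx, target)   # approx tends to target ≈ 3.0802 as n grows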
