
STAT3007: Introduction to Stochastic Processes
Introduction and Some Basics

Dr. John Wright

• The word “stochastic” comes from the Greek word for “to guess”
  – Stochastic means random or chance
• A stochastic model stands in contrast to a deterministic model
  – For a deterministic model, if the “experiment” is repeated using the same model with the same initial conditions, you will get the same result every time
  – This is not necessarily true for stochastic models: randomness will affect the results

• Example – Population Growth
  – A (very simple) deterministic model for the growth of a population comes from Malthus (1798), which states that the population size at time 𝑡, 𝑃(𝑡), is governed by the equation
    • d𝑃/d𝑡 = 𝑘𝑃, which means 𝑃(𝑡) = 𝑃0𝑒^(𝑘𝑡), where 𝑃0 is the initial population size
  – A stochastic version of this would be a Langevin equation: d𝑃/d𝑡 = 𝑘𝑃 + 𝜉, where 𝜉 is a random variable. In this model, 𝑃(𝑡) is no longer determined purely from 𝑘 and 𝑃0
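
To see the contrast concretely, here is a minimal simulation sketch in Python. The parameter values, the Gaussian choice for 𝜉, and the naive Euler discretisation are my own illustrative assumptions, not part of the course material:

```python
import random

# Euler discretisation of the Malthus model dP/dt = kP, next to a
# Langevin-style version dP/dt = kP + xi with xi drawn as Gaussian
# noise at each step (illustrative choices throughout).
k, P0, dt, steps, sigma = 0.1, 100.0, 0.01, 1000, 5.0

P_det, P_sto = P0, P0
for _ in range(steps):
    P_det += k * P_det * dt                                 # deterministic step
    P_sto += (k * P_sto + sigma * random.gauss(0, 1)) * dt  # noisy step

print(f"deterministic P(10) = {P_det:.2f}")  # always ~271.7, i.e. P0*e^(k*10)
print(f"one stochastic path = {P_sto:.2f}")  # different on every run
```

Running the deterministic update twice gives identical answers; the stochastic path changes from run to run, which is exactly the distinction made above. (A rigorous treatment of the noise term would use stochastic calculus, beyond the scope of this sketch.)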
• Why add 𝜉 into the Malthus model?
  – To allow for random effects which might affect the population size
  – More generally, to allow for greater uncertainty in the model
• Example
  – How would you model a coin toss?
  – Don’t the laws of physics (which we know) already describe perfectly (and deterministically) the motion and behaviour of a tossed coin?
• So we often want randomness in our models
• And if we want to describe and investigate these models with precision, we need mathematics that can handle randomness
  – This area of mathematics is Probability
• The study of Probability began in 16th century Europe
  – Focused more or less entirely on gambling (see Question 1 of Exercises for Week 1)

  – As time went on, various mathematical greats like Bernoulli, Euler, Gauss and Laplace worked on probability problems
  – But it wasn’t really until the early 20th century that probability theory was formalized and given a rigorous basis, by Kolmogorov (we’ll hear more from him later in the course)
• We won’t concern ourselves with the underlying, ‘deep’ mathematics of probability (for that, read about measure theory)
• But let’s remind ourselves of some probability fundamentals we will use again and again
• We begin with some set notation
  – Let 𝐴 and 𝐵 be events. Then
    • The union of 𝐴 and 𝐵, denoted 𝐴 ∪ 𝐵, is the event that at least one of 𝐴 or 𝐵 occurs
    • The intersection of 𝐴 and 𝐵, denoted 𝐴 ∩ 𝐵, is the event that both 𝐴 and 𝐵 occur
    • We can extend these to finite and countable numbers of events, i.e. 𝐴1 ∪ 𝐴2 ∪ ⋯ = ⋃_{𝑖=1}^∞ 𝐴𝑖 denotes the event that at least one of 𝐴1, 𝐴2, … occurs, and 𝐴1 ∩ 𝐴2 ∩ ⋯ = ⋂_{𝑖=1}^∞ 𝐴𝑖 is the event that they all occur

    • The impossible event, that cannot occur, is denoted by ∅
    • The certain event, that must occur, is denoted by Ω. It is the set of all possible outcomes
    • Two events 𝐴 and 𝐵 are called disjoint if they cannot both occur. In set notation, 𝐴 and 𝐵 are disjoint if 𝐴 ∩ 𝐵 = ∅
    • The complement of event 𝐴, denoted 𝐴ᶜ, is the event that 𝐴 does not occur. Thus 𝐴 and 𝐴ᶜ are disjoint, and 𝐴 ∪ 𝐴ᶜ = Ω
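
These set operations map directly onto Python’s built-in sets. A quick sketch, where the six-sided die and the two events are my own hypothetical example:

```python
# Events as Python sets, with one throw of a six-sided die as the
# sample space (a hypothetical example, not from the slides).
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}           # the event "an even number is thrown"
B = {1, 2, 3}           # the event "at most three is thrown"

print(A | B)            # union A ∪ B        -> {1, 2, 3, 4, 6}
print(A & B)            # intersection A ∩ B -> {2}
print(omega - A)        # complement A^c     -> {1, 3, 5}
print(A & (omega - A))  # A ∩ A^c            -> set(), the impossible event
```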
  – Now that we are familiar with notation for events, we can talk about the probabilities of these events occurring

  – The probability of the certain event occurring must be one. Hence Pr(Ω) = 1
  – The probability of the impossible event occurring must be zero. Hence Pr(∅) = 0
  – For any event 𝐴, we must have 0 ≤ Pr(𝐴) ≤ 1
  – If events 𝐴 and 𝐵 are disjoint, the probability that at least one occurs must be the sum of their probabilities. Hence if 𝐴 ∩ 𝐵 = ∅, then Pr(𝐴 ∪ 𝐵) = Pr(𝐴) + Pr(𝐵)
  – We can generalise this a bit further to obtain an Addition Law: for events 𝐴1, 𝐴2, … such that each pair 𝐴𝑖, 𝐴𝑗, 𝑖 ≠ 𝑗 is disjoint, Pr(⋃_{𝑖=1}^∞ 𝐴𝑖) = ∑_{𝑖=1}^∞ Pr(𝐴𝑖)

  – From the Addition Law follows the Law of Total Probability
    • Let 𝐴1, 𝐴2, … be disjoint events such that Ω = ⋃_{𝑖=1}^∞ 𝐴𝑖, i.e. one, and only one, of the events 𝐴1, 𝐴2, … will occur
    • Then, for any event 𝐵, we have Pr(𝐵) = ∑_{𝑖=1}^∞ Pr(𝐴𝑖 ∩ 𝐵)
  – We will use the Law of Total Probability and its variations a lot in this course. It allows us to work out probabilities for 𝐵 using other probabilities Pr(𝐴𝑖 ∩ 𝐵), which might be easier to find. A quick numerical check is sketched below
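
A minimal numerical check of the Law of Total Probability, assuming a fair six-sided die and one particular choice of partition (both my own illustrative choices):

```python
from fractions import Fraction

# Check Pr(B) = sum over i of Pr(A_i ∩ B) on a six-sided die with
# equally likely outcomes.
omega = {1, 2, 3, 4, 5, 6}
pr = lambda event: Fraction(len(event), len(omega))

partition = [{1, 2}, {3, 4}, {5, 6}]  # disjoint events whose union is omega
B = {2, 4, 6}                         # the event "even number"

total = sum(pr(A & B) for A in partition)
print(total, pr(B), total == pr(B))   # 1/2 1/2 True
```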

• Lastly, we define the crucial idea of independence
• We say events 𝐴 and 𝐵 are independent if Pr(𝐴 ∩ 𝐵) = Pr(𝐴) × Pr(𝐵)
• More generally, events 𝐴1, 𝐴2, … are independent if Pr(𝐴𝑖1 ∩ 𝐴𝑖2 ∩ ⋯ ∩ 𝐴𝑖𝑛) = Pr(𝐴𝑖1)Pr(𝐴𝑖2)⋯Pr(𝐴𝑖𝑛) for every finite set of distinct indices 𝑖1, 𝑖2, …, 𝑖𝑛
• Example – one throw of a balanced tetrahedral (four-sided) die
  – The possible outcomes are 1, 2, 3, 4. Thus Ω = {1, 2, 3, 4}. (Notation: curly brackets {} represent a set containing the elements inside the brackets)

  – Addition Law: Pr(even number) = Pr(2 or 4) = Pr(2) + Pr(4) = 1/4 + 1/4 = 1/2
  – Independence: consider the events ‘1 or 2’ and ‘1 or 3’. Then Pr((1 or 2) ∩ (1 or 3)) = Pr(1) = 1/4 = Pr(1 or 2) × Pr(1 or 3), so the events are independent
  – What about the events ‘1 or 2’, ‘1 or 3’ and ‘2 or 3’? Are these independent? (One way to check is sketched below)
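
One way to check is to test the product rule over every subset of two or more events, since pairwise independence alone is not enough. A small sketch, treating the outcomes of the balanced die as equally likely:

```python
from fractions import Fraction
from itertools import combinations
from math import prod

# One balanced tetrahedral die, equally likely outcomes.
omega = {1, 2, 3, 4}
pr = lambda event: Fraction(len(event), len(omega))

def independent(events):
    # The product rule must hold for EVERY subset of two or more events.
    for r in range(2, len(events) + 1):
        for combo in combinations(events, r):
            if pr(set.intersection(*combo)) != prod(pr(E) for E in combo):
                return False
    return True

A, B, C = {1, 2}, {1, 3}, {2, 3}
print(independent([A, B]))     # True, matching the slide's calculation
print(independent([A, B, C]))  # answers the slide's closing question
```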

• Now we describe random variables
  – A random variable 𝑋 is a numerical variable that depends on a random event
  – (Notation: we will denote random variables by capital letters only, e.g. 𝑋 or 𝑌. Lower case letters will denote real numbers, e.g. 𝑥 or 𝑦)

• We would like to (probabilistically) describe the behaviour of random variables (r.v.s)
• Consider the event that r.v. 𝑋 takes a value less than or equal to some real number 𝑥
  – i.e. 𝑋 ≤ 𝑥
• The probability of this event, Pr(𝑋 ≤ 𝑥), for all possible values of 𝑥 (i.e. −∞ < 𝑥 < ∞), defines the distribution function of r.v. 𝑋
  – Usually, we denote 𝐹(𝑥) = Pr(𝑋 ≤ 𝑥), −∞ < 𝑥 < ∞

• There are two types of r.v.s: discrete and continuous
• A r.v. 𝑋 is discrete if there is a finite or countable set of values 𝑥1, 𝑥2, … such that Pr(𝑋 = 𝑥𝑖) = 𝑎𝑖 > 0 for each 𝑖 and ∑_𝑖 𝑎𝑖 = 1
  – E.g. 𝑋 is the number shown after one throw of a dodecahedral die
  – E.g. 𝑋 is the pair of numbers shown after one throw of a pair of pentagonal trapezohedron dice
  – E.g. 𝑋 is the number of lung cancer deaths in Hong Kong in 2015

• For a discrete r.v., each 𝑥𝑖 has a non-zero probability of occurring
  – The function which tells us the probability of r.v. 𝑋 taking value 𝑥𝑖 is called the probability mass function (p.m.f.)
  – That is, the p.m.f. for 𝑋 is the function 𝑝(𝑥) such that 𝑝(𝑥𝑖) = 𝑎𝑖 for 𝑖 = 1, 2, …
  – Thus the distribution function and p.m.f. of a discrete r.v. are related by 𝐹(𝑥) = ∑_{𝑥𝑖 ≤ 𝑥} 𝑝(𝑥𝑖), as the short sketch below illustrates
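
As a quick illustration of this relation, here is a sketch that builds 𝐹 from 𝑝 for a fair four-sided die (the p.m.f. is my illustrative choice):

```python
# The distribution function F(x) = sum over x_i <= x of p(x_i),
# for one throw of a fair four-sided die (illustrative p.m.f.).
pmf = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}

def F(x):
    return sum(p for xi, p in pmf.items() if xi <= x)

for x in (0.5, 1, 2.7, 4):
    print(f"F({x}) = {F(x)}")  # 0, 0.25, 0.5, 1.0 — a right-continuous step function
```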

• A r.v. 𝑋 is continuous if Pr(𝑋 = 𝑥) = 0 for every value of 𝑥
  – E.g. 𝑋 is the height of a STAT3007 student
  – E.g. 𝑋 is the time it takes a STAT3007 student to finish the first Problem Sheet
• If there is a non-negative function 𝑓(𝑥), defined on −∞ < 𝑥 < ∞, such that
  – Pr(𝑎 < 𝑋 ≤ 𝑏) = ∫_𝑎^𝑏 𝑓(𝑦) d𝑦 for −∞ < 𝑎 < 𝑏 < ∞,
• then 𝑓(𝑥) is called the probability density function (p.d.f.) of 𝑋
• Thus the distribution function and p.d.f. of 𝑋 are related by 𝐹(𝑥) = ∫_{−∞}^𝑥 𝑓(𝑦) d𝑦
• Moreover, if 𝐹(𝑥) is differentiable (it usually is in this course), then (d/d𝑥)𝐹(𝑥) = 𝑓(𝑥) for −∞ < 𝑥 < ∞ (both relations are checked numerically below)
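
Both relations can be checked numerically. A sketch using the Exponential density introduced at the end of these slides (the rate 𝜆 = 2 and the crude Riemann sum are my own choices):

```python
import math

lam = 2.0                               # rate parameter (illustrative value)
f = lambda x: lam * math.exp(-lam * x)  # Exponential p.d.f. (see final slide)
F = lambda x: 1 - math.exp(-lam * x)    # its distribution function

# F(x) should match the integral of f up to x (crude Riemann sum)...
x, n = 1.5, 100_000
dx = x / n
print(sum(f(i * dx) * dx for i in range(n)), F(x))  # both ~ 0.9502

# ...and the derivative of F should match f (finite difference).
h = 1e-6
print((F(x + h) - F(x - h)) / (2 * h), f(x))        # both ~ 0.0996
```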

• Be careful though
  – p.d.f. 𝑓(𝑥) does not mean 𝑓(𝑥) = Pr(𝑋 = 𝑥). (Remember that for continuous r.v.s, we have Pr(𝑋 = 𝑥) = 0)
• A very informal statement that relates a p.d.f. to a probability is the following:
  – Pr(𝑥 < 𝑋 < 𝑥 + d𝑥) = 𝐹(𝑥 + d𝑥) − 𝐹(𝑥) = d𝐹(𝑥) = 𝑓(𝑥) d𝑥

• From the p.m.f. of a discrete r.v. or the p.d.f. of a continuous r.v., we define expectations
• Let 𝑋 be discrete, with p.m.f. 𝑝(𝑥). Let 𝑌 be a function of 𝑋, e.g. 𝑌 = 𝑔(𝑋) (thus 𝑌 is also a r.v.). The expected value of 𝑌 (equivalently, the expected value of 𝑔(𝑋)) is given by
  – 𝐸[𝑌] = 𝐸[𝑔(𝑋)] = ∑_{𝑖=1}^∞ 𝑔(𝑥𝑖) Pr(𝑋 = 𝑥𝑖) = ∑_{𝑖=1}^∞ 𝑔(𝑥𝑖) 𝑝(𝑥𝑖)
• Let 𝑋 be continuous, with p.d.f. 𝑓(𝑥). Let 𝑌 = 𝑔(𝑋), again. This time, the expected value of 𝑌 is given by
  – 𝐸[𝑌] = 𝐸[𝑔(𝑋)] = ∫_{−∞}^∞ 𝑔(𝑥) 𝑓(𝑥) d𝑥
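
A short sketch of both formulas, with 𝑔(𝑥) = 𝑥² for a fair six-sided die and 𝑔(𝑥) = 𝑥 for an Exponential r.v. (all illustrative choices; the infinite integral is truncated, which is an approximation):

```python
import math

# Discrete case: E[g(X)] for a fair six-sided die with g(x) = x^2.
pmf = {x: 1 / 6 for x in range(1, 7)}
print(sum(x**2 * p for x, p in pmf.items()))  # 91/6 ≈ 15.1667

# Continuous case: E[X] for an Exponential(lam) r.v., approximating
# the integral by a truncated Riemann sum.
lam, upper, n = 2.0, 50.0, 200_000
dx = upper / n
print(sum((i * dx) * lam * math.exp(-lam * i * dx) * dx for i in range(n)),
      1 / lam)                                # both ~ 0.5
```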

• Sometimes we will consider not just single r.v.s by themselves, but pairs or even 𝑛-tuples of r.v.s
  – In this case, we need to talk about their joint distributions
• Consider a pair of r.v.s, (𝑋, 𝑌). Their joint distribution function is the function 𝐹 such that
  – 𝐹(𝑥, 𝑦) = Pr(𝑋 ≤ 𝑥, 𝑌 ≤ 𝑦) for any 𝑥, 𝑦
• We can ‘back out’ the distribution of 𝑋 from the joint distribution 𝐹 by considering what happens to 𝐹(𝑥, 𝑦) as 𝑦 → ∞
  – The function 𝐹𝑋(𝑥) = lim_{𝑦→∞} 𝐹(𝑥, 𝑦) is a distribution function, and is called the marginal distribution function of 𝑋
• Of course, the marginal distribution function of 𝑌 is given by 𝐹𝑌(𝑦) = lim_{𝑥→∞} 𝐹(𝑥, 𝑦)

• The joint distribution function is said to have a joint probability density function (j.p.d.f.) 𝑓 if
  – 𝐹(𝑥, 𝑦) = ∫_{−∞}^𝑥 ∫_{−∞}^𝑦 𝑓(𝜉, 𝜂) d𝜂 d𝜉
• From the j.p.d.f. we can ‘back out’ the marginal density function for 𝑋 using
  – 𝑓𝑋(𝑥) = ∫_{−∞}^∞ 𝑓(𝑥, 𝑦) d𝑦
• Similarly, the marginal density function for 𝑌 is found from
  – 𝑓𝑌(𝑦) = ∫_{−∞}^∞ 𝑓(𝑥, 𝑦) d𝑥
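
A sketch of this marginalisation done numerically. The joint density 𝑓(𝑥, 𝑦) = 𝑥 + 𝑦 on the unit square is my own illustrative choice; its exact marginal is 𝑓𝑋(𝑥) = 𝑥 + 1/2:

```python
# Integrate y out of a joint density to recover the marginal of X.
def f(x, y):
    return x + y if (0 <= x <= 1 and 0 <= y <= 1) else 0.0

def f_X(x, n=10_000):
    dy = 1.0 / n                    # midpoint rule over y in [0, 1]
    return sum(f(x, (i + 0.5) * dy) * dy for i in range(n))

for x in (0.25, 0.5, 0.75):
    print(f_X(x), x + 0.5)          # numeric vs exact marginal
```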

• We use marginal distribution functions to define independence for a pair of random variables
  – If 𝐹(𝑥, 𝑦) = 𝐹𝑋(𝑥) × 𝐹𝑌(𝑦) for each 𝑥, 𝑦, then r.v.s 𝑋 and 𝑌 are independent
  – Furthermore, if 𝑋 and 𝑌 are independent with j.p.d.f. 𝑓(𝑥, 𝑦), then we have 𝑓(𝑥, 𝑦) = 𝑓𝑋(𝑥)𝑓𝑌(𝑦), i.e. the joint density is the product of the marginal densities

• Some important discrete distributions
  – The Bernoulli Distribution
    • A r.v. 𝑋 has the Bernoulli distribution with parameter 𝑝 if 𝑋 can only take two values, 0 or 1, with probabilities (1 − 𝑝) and 𝑝 respectively
    • Thus the p.m.f. is given by 𝑝(𝑥) = 𝑝 if 𝑥 = 1, and 𝑝(𝑥) = 1 − 𝑝 if 𝑥 = 0
    • 𝐸[𝑋] = 𝑝, Var(𝑋) = 𝑝(1 − 𝑝)
  – Define an indicator function for an event 𝐴 by 𝟏(𝐴) = 𝟏𝐴 = 1 if 𝐴 occurs, and 𝟏𝐴 = 0 if 𝐴ᶜ occurs
  – Then 𝟏𝐴 is a Bernoulli r.v. (what is the parameter 𝑝 in this case? The simulation below offers a hint)
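
A quick simulation sketch, with 𝐴 = ‘an even number on a six-sided die’ as a hypothetical event:

```python
import random

# Simulate the indicator 1_A for A = "even number on a six-sided die".
# The long-run average of a Bernoulli r.v. estimates its parameter p.
trials = 100_000
indicator_sum = sum(1 for _ in range(trials) if random.randint(1, 6) % 2 == 0)
print(indicator_sum / trials)  # ~ 0.5, suggesting p = Pr(A)
```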
  – The Binomial Distribution
    • A r.v. 𝑋 has the binomial distribution with parameters 𝑛 and 𝑝 if it is the sum of 𝑛 independent Bernoulli r.v.s, each with parameter 𝑝
    • That is, 𝑋 = ∑_{𝑖=1}^𝑛 𝐵𝑖, where 𝐵1, 𝐵2, …, 𝐵𝑛 are independently and identically distributed with the Bernoulli distribution with parameter 𝑝
    • The p.m.f. for 𝑋 is 𝑝(𝑘) = (𝑛!/(𝑘!(𝑛 − 𝑘)!)) 𝑝^𝑘 (1 − 𝑝)^(𝑛−𝑘) for 𝑘 = 0, 1, …, 𝑛
    • 𝐸[𝑋] = 𝑛𝑝 and Var(𝑋) = 𝑛𝑝(1 − 𝑝)
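
A simulation sketch of this construction (the values of 𝑛, 𝑝 and the replication count are illustrative):

```python
import random
from statistics import mean, variance

# Simulate X = B1 + ... + Bn with i.i.d. Bernoulli(p) summands and
# compare the sample mean and variance against np and np(1 - p).
n, p, reps = 20, 0.3, 50_000
samples = [sum(1 for _ in range(n) if random.random() < p) for _ in range(reps)]
print(mean(samples), n * p)                # ~ 6.0
print(variance(samples), n * p * (1 - p))  # ~ 4.2
```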

  – The Poisson Distribution
    • A r.v. 𝑋 has the Poisson distribution with parameter 𝜆 if it has p.m.f. 𝑝(𝑘) = 𝜆^𝑘𝑒^(−𝜆)/𝑘! for 𝑘 = 0, 1, …
    • We will see later that the Poisson distribution can be viewed as the limit of the binomial distribution as 𝑛 → ∞ while 𝑛𝑝 = 𝜆 is held constant
    • This explains why the Poisson distribution occurs regularly in nature, through the so-called Law of Rare Events (a numerical illustration is sketched below)
    • 𝐸[𝑋] = 𝜆, Var(𝑋) = 𝜆
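
A numerical glimpse of that limit (the choices 𝜆 = 2 and 𝑘 = 3 are mine):

```python
from math import comb, exp, factorial

# Binomial(n, lam/n) p.m.f. at k approaches the Poisson(lam) p.m.f.
# as n grows with np = lam held fixed.
lam, k = 2.0, 3
poisson = lam**k * exp(-lam) / factorial(k)
for n in (10, 100, 1000, 10000):
    p = lam / n
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    print(f"n = {n:>5}: binomial = {binom:.6f}, poisson = {poisson:.6f}")
```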

  – The Exponential Distribution
    • The most important distribution for this course. It is deeply embedded in the processes we will investigate. We will look at the special properties of the Exponential distribution soon. For now, here is a brief introduction
    • A r.v. 𝑋 has an Exponential distribution with parameter 𝜆 > 0 if it has p.d.f. 𝑓(𝑥) = 𝜆𝑒^(−𝜆𝑥) for 𝑥 ≥ 0, and 𝑓(𝑥) = 0 for 𝑥 < 0
    • 𝐸[𝑋] = 1/𝜆, Var(𝑋) = 1/𝜆²
