randomnumbers-5
2019
Shai Carmi
Outline
• Introduction
• Basic method
• Discrete random variables
• Continuous random variables
Why do we need the computer to generate random numbers?
• Example:
o A method estimates gene expression from sequencing reads
o We would like to test the method, but for any real data we don’t know the true levels
o We generate many synthetic datasets, for which we know the true levels, by randomly drawing
expression levels, reads, and errors
Simulations to test complex models
• Example:
o An RNA molecule is traveling in the nucleus
o How long will it take it to exit?
o Sometimes the theory is too complicated
o Instead, we simulate a random walk
[Figure labels: Nucleus, Pore, RNA]
Generating random numbers
• Traditional methods:
o Flip a coin
o Roll a die
o Pick a card from a deck
o Pick a ball from an urn
o Roulette
Linear Congruential Generator
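• Provide a seed: $x_1$
• Then, set: $x_i = (a x_{i-1} + c) \bmod m$
• Example: $a = 2$, $c = 3$, $m = 10$, $x_1 = 1$
$i$    $x_{i-1}$    $a x_{i-1}$    $a x_{i-1} + c$
2      1            2              5
3      5            10             13
4      3            6              9
5      9            18             21
6      1            2              5
7      5            10             13
8      3            6              9
9      9            18             21
10     1            2              5
o The sequence: 1,5,3,9,1,5,3,9,1,5,…
o Highly repetitive!
• The method generates integers in the range $[0, m-1]$
• We can’t draw more than $m$ numbers without repeating ourselves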
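The sequence 1,5,3,9,… can be reproduced with a few lines of R. This is a minimal sketch, assuming the update rule $x_i = (a x_{i-1} + c) \bmod m$ with $a = 2$, $c = 3$, $m = 10$ and seed 1, which are the values consistent with the table above. The function name `lcg` and its argument names are illustrative, not part of any package.

```r
# Minimal linear congruential generator (illustrative helper, not a library function)
lcg <- function(n, seed, a, c, m) {
  x <- numeric(n)
  x[1] <- seed
  for (i in seq_len(n - 1)) {
    x[i + 1] <- (a * x[i] + c) %% m   # next value from the previous one
  }
  x
}

small <- lcg(10, seed = 1, a = 2, c = 3, m = 10)
print(small)
```

With such a small $m$ the period is only 4, which is why practical generators use a much larger modulus.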
Large $m$
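n = 200000                 # number of draws
m = 54321                  # modulus
a = 12345                  # multiplier (the increment c is 0 here)
x = numeric(n)
y = 1                      # seed
x[1] = y/m
for (i in 2:n)
{
  y = (a*y) %% m           # update the integer state
  x[i] = y/m               # rescale to a number between 0 and 1
}
hist(x, breaks=25)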
Practice
• All common software packages provide a function that draws a random number $U$ uniformly
distributed in [0,1]
o This means that $U$ takes any value in [0,1] with equal probability
o In R: runif
• The probability of each number (or combination of numbers) should be the same
as that of any other number (or combination)
set.seed(1)
n = 10000
x = runif(n)
y = runif(n)
plot(x,y,pch=".")
Is it possible to generate true random numbers?
Generating integers using a uniform random number
• Ceiling of $x$: the smallest integer that is larger than or equal to $x$ (rounding upwards)
• ceil[1.1]=2, ceil[4]=4, ceil[0.8]=1, ceil[5.23]=6
Generating integers within a range
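One common way to turn a uniform number into an integer within a range is sketched below. It assumes the standard construction: if $U$ is uniform on (0,1), then $\lceil kU \rceil$ is uniform on $1, \ldots, k$, and shifting gives any range. The helper name `draw_int` and its arguments are hypothetical.

```r
set.seed(1)

# Hypothetical helper: draw n integers uniformly from lo, lo+1, ..., hi
draw_int <- function(n, lo, hi) {
  u <- runif(n)                 # uniform numbers strictly between 0 and 1
  k <- hi - lo + 1              # number of possible values
  lo - 1 + ceiling(k * u)       # ceiling(k * u) is uniform on 1..k
}

die <- draw_int(6000, 1, 6)     # simulated rolls of a die
print(table(die))
```

Each face should appear roughly 1000 times.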
Random variables: reminder
• A random variable $X$ can take different values with certain probabilities, based on
the results of an experiment
• The distribution of $X$ is the list of possible values $x_1, x_2, \ldots$ along with their probabilities $p_1, p_2, \ldots$
• To draw numbers from this distribution:
• First, draw a number $U$ uniformly distributed in [0,1]
• Then, set: $X = x_j$ if $p_1 + \cdots + p_{j-1} \le U < p_1 + \cdots + p_j$
The inverse transformation method: why?
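• Suppose $U$ is a random variable with equal probability to take any value in [0,1]
• Then the probability that $U$ falls between $a$ and $b$ is the length of that interval, $b - a$
[Figure: possible values of $U$ on the interval from 0 to 1, with points $a$ and $b$ marked]
[Figure: possible values of $U$ split at 0.4 and 0.9; the three segments correspond to the values 1, 3, 6]
[Figure: general case; the interval from 0 to 1 is split at $p_1$, $p_1 + p_2$, $p_1 + p_2 + p_3$ into segments of lengths $p_1, p_2, p_3, p_4$ that correspond to $x_1, x_2, x_3, x_4$]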
Example
$j$    Value of $X$    Condition on $U$
1      1               $U < 0.4$
2      3               $0.4 \le U < 0.9$
3      6               $0.9 \le U$
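The example can be sketched in R as follows, assuming the values 1, 3, 6 have probabilities 0.4, 0.5, 0.1 (the segment lengths implied by the boundaries 0.4 and 0.9). The variable names are illustrative.

```r
set.seed(1)

values <- c(1, 3, 6)
probs  <- c(0.4, 0.5, 0.1)
breaks <- cumsum(probs)[-length(probs)]   # interior boundaries: 0.4, 0.9

n <- 100000
u <- runif(n)
# findInterval(u, breaks) counts how many boundaries are <= u (0, 1, or 2)
x <- values[findInterval(u, breaks) + 1]

freq <- as.numeric(table(factor(x, levels = values))) / n
print(rbind(observed = freq, expected = probs))
```

The observed frequencies should be close to 0.4, 0.5, and 0.1.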
Example: tossing coins
• Suppose we would like to draw a random number $X$, equal to the number of “heads”
out of three coin tosses
• $X$ is binomial with $n = 3$ and $p = 1/2$
Event $X$ Probability
TTT 0 1/8
TTH 1 1/8
THT 1 1/8
THH 2 1/8
HTT 1 1/8
HTH 2 1/8
HHT 2 1/8
HHH 3 1/8
The inverse transformation method
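$j$    $x_j$    $p_j$
1      0        1/8
2      1        3/8
3      2        3/8
4      3        1/8
[Figure: possible values of $U$ on the interval from 0 to 1, split at 1/8, 1/2, 7/8 into segments of lengths 1/8, 3/8, 3/8, 1/8 that correspond to $X$ = 0, 1, 2, 3]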
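As a rough check of the coin example, the sketch below draws the number of heads with the inverse transformation method and compares it with tossing three simulated coins directly. The sample size and variable names are arbitrary choices made for illustration.

```r
set.seed(1)

heads  <- 0:3
p      <- dbinom(heads, size = 3, prob = 0.5)   # 1/8, 3/8, 3/8, 1/8
breaks <- cumsum(p)[-length(p)]                 # 1/8, 1/2, 7/8

n <- 80000
u <- runif(n)
x <- heads[findInterval(u, breaks) + 1]         # inverse transformation

# Direct simulation for comparison: three tosses per row, count the heads
direct <- rowSums(matrix(runif(3 * n) < 0.5, ncol = 3))

freq_inv <- as.numeric(table(factor(x, levels = heads))) / n
freq_dir <- as.numeric(table(factor(direct, levels = heads))) / n
print(round(rbind(inverse = freq_inv, direct = freq_dir, exact = p), 3))
```

Both sets of frequencies should agree with the exact binomial probabilities up to sampling noise.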
The inverse transformation method
• Limited to specific cases where the problem has some structure, symmetry, etc.
Continuous random variables
• Examples:
o Standard uniform variables can take any value in [0,1]
o Normal variables can take any real value
The cumulative distribution function
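Standard uniform variable: $f(x) = 1$ and $F(x) = \int_0^x f(y)\,dy = x$ for $0 \le x \le 1$
[Figure: two panels, the density $f(x) = 1$ and the cumulative distribution function $F(x) = x$, both for $x$ from 0 to 1]
• Intuition:
o The random number $U$ is a position on the $y$-axis
o We find the $x$ for which $F(x) = U$
o The larger $f(x)$ is in a region, the more values of $U$ are covered by that region of $x$
[Figure: $F(x)$ and $f(x)$ plotted against $x$, with $U$ marked on the $y$-axis]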
Example: non-standard uniform variable
• $f(x) = 1/4$ for $0 \le x \le 4$
[Figure: $f(x)$ is flat at height 1/4 between $x = 0$ and $x = 4$]
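For this density the cumulative distribution function is $F(x) = x/4$ on $[0, 4]$, so solving $F(x) = U$ gives $x = 4U$. A minimal sketch, with an arbitrary sample size:

```r
set.seed(1)

n <- 100000
u <- runif(n)     # standard uniform draws
x <- 4 * u        # solves F(x) = x/4 = u

# The mean should be near 2, and about a quarter of the draws should fall below 1
print(c(mean = mean(x), frac_below_1 = mean(x <= 1)))
```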
Exponential random variables
• Examples:
o The number of generations until a mutation happens
o The time until a protein binds to DNA
o The time until a phone call in a call center
o The time until a lightbulb stops working
o The time to cancer, heart attack, injury
• These are exponential at least as a null model
Properties of exponential random variables
• $X$ is memoryless: $P(X > s + t \mid X > s) = P(X > t)$
o Interpretation: the waiting time is the same regardless of whether we start counting from time
zero or from any other time
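The memoryless property, $P(X > s + t \mid X > s) = P(X > t)$, can be checked empirically. The sketch below uses R's built-in `rexp`; the rate and the times `s0` and `t0` are arbitrary values chosen for illustration.

```r
set.seed(1)

rate <- 0.5                          # illustrative rate
x  <- rexp(200000, rate = rate)
s0 <- 1                              # time already waited
t0 <- 2                              # additional waiting time

cond   <- mean(x[x > s0] > s0 + t0)  # estimate of P(X > s0 + t0 | X > s0)
uncond <- mean(x > t0)               # estimate of P(X > t0)
print(c(conditional = cond, unconditional = uncond, exact = exp(-rate * t0)))
```

The two estimates should agree with each other and with $e^{-\lambda t}$ up to sampling noise.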
How to draw an exponential random variable
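One way to do this is the inverse transformation method. The sketch assumes the usual exponential density $f(x) = \lambda e^{-\lambda x}$, for which $F(x) = 1 - e^{-\lambda x}$; solving $F(x) = U$ gives $x = -\ln(1 - U)/\lambda$. The rate value is arbitrary.

```r
set.seed(1)

rate <- 2                        # illustrative rate parameter (lambda)
n <- 100000
u <- runif(n)
x <- -log(1 - u) / rate          # solves 1 - exp(-rate * x) = u

print(c(mean = mean(x), expected = 1 / rate))
```

Because $1 - U$ is itself uniform on [0,1], `-log(u) / rate` would work equally well.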
Other sampling methods
• Rejection sampling
• Markov chain Monte Carlo (MCMC): Metropolis-Hastings algorithm
o For drawing points in high dimension
• Basic idea:
o We need to draw a number from a target distribution, but there is no direct method
o Instead, we draw numbers from a proposal distribution, from which sampling is easy
o The proposal can be uniform, normal, etc.
o We then accept or reject the new sample depending on the relationship between the target and
proposal densities at that point
o MCMC generates a sequence of correlated samples, which we need to “thin” to obtain
approximately independent draws
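The accept/reject idea can be illustrated with a random-walk Metropolis sampler, the special case of Metropolis-Hastings with a symmetric proposal. This is only a sketch: the target density (proportional to $e^{-|x|^3}$), the proposal standard deviation, the chain length, and the thinning interval are all assumptions chosen for illustration.

```r
set.seed(1)

# Illustrative target, known only up to a constant: f(x) proportional to exp(-|x|^3)
log_f <- function(x) -abs(x)^3

n_steps <- 50000
chain   <- numeric(n_steps)
current <- 0                                  # arbitrary starting point
for (i in seq_len(n_steps)) {
  proposal <- current + rnorm(1, sd = 1)      # symmetric normal proposal
  # Accept with probability min(1, f(proposal) / f(current))
  if (log(runif(1)) < log_f(proposal) - log_f(current)) {
    current <- proposal
  }
  chain[i] <- current                         # a rejected proposal repeats the old value
}

thinned <- chain[seq(10, n_steps, by = 10)]   # keep every 10th sample
print(c(mean = mean(thinned), var = var(thinned)))
```

Thinning reduces the correlation between successive samples but does not remove it entirely; a suitable interval depends on the target and the proposal.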