Tutorial 2 - Questions.

Uploaded by

bhattibaba118

ICT583

7 Mar 2023

ICT583 Data Science Applications


Tutorial 2
Mathematical Preliminaries

Scenarios
Many instances of binomial distributions can be found in real life.
- For example, if a new drug is introduced to cure a disease, it either cures the disease
(it’s successful) or it doesn’t cure the disease (it’s a failure).
- If you purchase a lottery ticket, you’re either going to win money, or you aren’t.
Basically, anything you can think of that can only be a success or a failure can be represented by
a binomial distribution.
A binomial distribution can be thought of as simply the probability of a SUCCESS or FAILURE
outcome in an experiment or survey that is repeated multiple times.

Maths
X ~ Binomial(n, p)
- For a binomial distribution of the outcomes, n = number of trials, p = probability of success on each trial
Question: is it a discrete probability distribution or a continuous one? Answer: discrete.

Coding
In these exercises, you will practice using the rbinom() function to generate random “flips” that are
either “heads” (1) or “tails” (0), to simulate random data.
https://stat.ethz.ch/R-manual/R-devel/library/stats/html/Binomial.html
rbinom(n, size, prob)
generates n random values, each the number of successes in size trials with success probability prob.
n - number of observations
size - number of trials per observation
prob - probability of success of each trial.
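The difference between n and size is easy to mix up, so here is a minimal sketch of the two arguments in use. The seed and the variable names single_flips and ten_flips are assumptions for illustration only, not part of the tutorial.

```r
set.seed(583)  # arbitrary seed, only so the sketch is reproducible

# n = 5 observations, size = 1: each value is one coin flip (0 or 1)
single_flips <- rbinom(n = 5, size = 1, prob = 0.5)

# n = 5 observations, size = 10: each value is the number of heads
# in 10 flips, so an integer between 0 and 10
ten_flips <- rbinom(n = 5, size = 10, prob = 0.5)

print(single_flips)
print(ten_flips)
```

In both calls the result has length n; size only changes the range of values each observation can take.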


Part One: Simulating coin flips


1.1 Flipping a coin in R
# you will simulate 10 coin flips, each with a 30% chance of coming up “heads”:
- 10 coin flips = 10 observations
- Flipping 1 coin per observation (size = 1)
- 30% probability of heads
rbinom(10, 1, .3)

Flipping multiple coins in R


# Generate 100 occurrences of flipping 10 coins, each with 30% probability
- 100 observations
- Flipping 10 coins per observation
- 30% chance of getting heads
rbinom(100, 10, .3)

How about increasing the number of observations, as well as the number of trials in each
observation?

library(magrittr)  # provides the %>% pipe used below
rbinom(100, 3, .5) %>% hist


rbinom(100, 6, .5) %>% hist

rbinom(1000, 6, .5) %>% hist


rbinom(1000, 9, .5) %>% hist

rbinom(10000, 9, .5) %>% hist


rbinom(10000, 12, .5) %>% hist

# show relative frequency (density) rather than counts, so bar heights estimate probability
rbinom(10000, 12, .5) %>% hist(freq = FALSE)
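To see how close the simulated histogram is to the theoretical distribution, the simulated proportions can be tabulated next to the exact binomial probabilities. This is a minimal sketch; the seed and the names n_obs, size, sim, sim_prop, exact and comparison are assumptions for illustration.

```r
set.seed(583)  # arbitrary seed for reproducibility

n_obs <- 10000  # number of simulated observations
size  <- 12     # coin flips per observation

sim <- rbinom(n_obs, size, 0.5)

# Proportion of observations that gave 0, 1, ..., 12 heads.
# factor() with explicit levels keeps outcomes that never occurred (count 0).
sim_prop <- as.vector(table(factor(sim, levels = 0:size))) / n_obs

# Exact probabilities from the binomial PMF
exact <- dbinom(0:size, size, 0.5)

comparison <- data.frame(heads = 0:size,
                         simulated = sim_prop,
                         exact = round(exact, 4))
print(comparison)
```

With 10000 observations the two columns should agree closely; with only 100 the gaps are usually much larger.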


1.2 Calculating density of a binomial


By hand
Calculate the probability that exactly 2 flips are heads, out of 10 trials, using the binomial probability mass
function (binomial PMF) formula P(X = k) = n! / ((n - k)! k!) * p^k * (1 - p)^(n - k)
- n = number of trials = 10
- k = number of desired outcomes = 2
- p = probability = 0.3

First, compute the binomial coefficient


10! / (8! 2!)
= (10 × 9 × 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1) / ((8 × 7 × 6 × 5 × 4 × 3 × 2 × 1) × (2 × 1))
= (10 × 9) / (2 × 1)
= 90 / 2
= 45
Then calculate the probability when k = 2
P(X = 2)
= 45 * (0.3 ^ 2) * (0.7 ^ 8) = 0.2334744
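The hand calculation can be reproduced in R with choose(), which returns the binomial coefficient. This is a minimal sketch; the names n, k, p, n_choose_k and prob_by_hand are chosen here for illustration.

```r
n <- 10   # number of trials
k <- 2    # number of heads wanted
p <- 0.3  # probability of heads on each trial

n_choose_k <- choose(n, k)  # binomial coefficient 10! / (8! 2!) = 45

# Binomial PMF: C(n, k) * p^k * (1 - p)^(n - k)
prob_by_hand <- n_choose_k * p^k * (1 - p)^(n - k)

print(n_choose_k)
print(prob_by_hand)
```

The second value should match the 0.2334744 obtained by hand.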

Coding
dbinom(x, size, prob)
gives the probability P(X = x) at each point x (the binomial PMF).
x - vector of numbers (the values at which to evaluate the binomial PMF)
If you flip 10 coins each with a 30% probability of coming up heads, what is the probability
exactly 2 of them are heads?
# Calculate the probability that 2 are heads using dbinom
dbinom(2, 10, .3)
plot(0:10, dbinom(0:10, size=10, prob=.3), type='h')   # start at 0: zero heads is also a possible outcome


# Confirm your answer with a simulation using rbinom

For example, with 100 observations, the random deviates are
r = rbinom(100, 10, .3)

# which of the results exactly have 2 heads?


R <- r == 2

# to compute the proportion of these logical results,


mean(R)

# how about increasing the number of observations?

# what do you observe?
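One way to explore the question above is to repeat the simulation for several sample sizes and compare each estimate with the exact value from dbinom. This is a minimal sketch; the seed, the chosen sample sizes and the names exact, n_obs and estimate are assumptions for illustration.

```r
set.seed(583)  # arbitrary seed for reproducibility

exact <- dbinom(2, 10, 0.3)  # exact P(X = 2)

n_obs <- c(100, 1000, 10000, 100000)

# For each sample size, the proportion of observations with exactly 2 heads
estimate <- sapply(n_obs, function(n) mean(rbinom(n, 10, 0.3) == 2))

print(data.frame(n_obs, estimate, abs_error = abs(estimate - exact)))
```

The absolute error tends to shrink as the number of observations grows, although any single run can deviate from that trend.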

# For a fair coin the chance of heads is 0.5. What is the probability of getting 2 heads in 10
# trials per observation?
dbinom(2, 10, .5)

# how about 8 heads?


dbinom(8, 10, .5)


1.3 Calculating cumulative density of a binomial


pbinom(q, size, prob)
gives the cumulative probability P(X <= q), i.e. the probability of at most q successes.

Scenario
If you flip ten coins that each have a 30% probability of heads, what is the probability at least
five are heads?

# Calculate the probability that at least five coins are heads


# the cumulative probability of at most four heads, P(X <= 4), is
r = pbinom(4, 10, .3)

# cumulative density curve


lapply(1:10, function(x) pbinom(x, 10, .3)) %>% unlist %>% plot
plot(1:10, pbinom(1:10, 10, .3), type="h")

# the probability of five or more heads, P(X >= 5) = 1 - P(X <= 4), is


1-r
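The same exact answer can be cross-checked in two other ways: with the lower.tail argument of pbinom, or by summing the PMF over 5 to 10 heads. This is a minimal sketch; the variable names are chosen here for illustration.

```r
# Three equivalent ways to get P(X >= 5) for X ~ Binomial(10, 0.3)
via_complement <- 1 - pbinom(4, 10, 0.3)                  # 1 - P(X <= 4)
via_upper_tail <- pbinom(4, 10, 0.3, lower.tail = FALSE)  # P(X > 4)
via_pmf_sum    <- sum(dbinom(5:10, 10, 0.3))              # P(X = 5) + ... + P(X = 10)

print(c(via_complement, via_upper_tail, via_pmf_sum))
```

All three values should agree up to floating-point rounding. Note that the cut-off passed to pbinom is 4, not 5, because pbinom includes the value q itself.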

# Confirm your answer with a simulation using rbinom, with 10000 observations
mean( rbinom(10000, 10, .3) >= 5 )

Try to simulate 100, 1000, 10000, 100000 observations.

Which is closest to the exact answer?


1.4 Expected values and variance for binomial distribution


Expected values
- e.g., if a fair coin is flipped an unlimited number of times, we expect the proportion of heads
to be 0.5
Calculate the expected value using the exact formula
Expected value = n * p
- n = number of trials
- p = probability of success
# What is the expected value of a binomial distribution where 1 coin is flipped, having a 50%
# chance of heads?
1 * 0.5
# What is the expected value of a binomial distribution where 25 coins are flipped, each having a
# 30% chance of heads?
25 * .3   # = 7.5
# Confirm with a simulation using rbinom, assuming 10000 observations
mean(rbinom(10000, 1, 0.5))
mean(rbinom(10000, 25, 0.3))

Variance
What is the variance of a binomial distribution where n coins are flipped, each having a p chance
of heads?
Var= n * p * (1-p)
When n = 1, p = .5, var = 0.25, SD = sqrt(.25) = 0.5
When n = 25, p = .3, var = 5.25 , SD = 2.291288
# Confirm with a simulation using rbinom
r = rbinom(10000, 25, 0.3)
var(r)
sd(r)


Part Two: Probability of compound events


If events A and B are independent, and
- A has a 40% chance of happening, and
- event B has a 20% chance of happening,

By hand
What is the probability that both will happen?
Joint probability: P(A ⋂ B) = P(A) * P(B) # intersection
= .4 * .2
= .08
What is the probability that either A or B will happen?
Union probability: P(A ⋃ B) # union
= P(A) + P(B) - P(A ⋂ B)
= .4 + .2 - .4 * .2
= .52

Coding
Assuming 100000 observations of one trial each,
A <- rbinom(100000, 1, .4)
B <- rbinom(100000, 1, .2)
a = mean(A)
b = mean(B)

j = a * b       # estimated P(A and B)
u = a + b - j   # estimated P(A or B)

# or
mean(A & B)
mean(A | B)


Part Three: Normal distribution


- For continuous variable
Suppose you flipped 1000 coins, each with a 20% chance of being heads.
What would be the mean and variance of the binomial distribution?
# Mean
= n * p
= 1000 * 0.2 = 200
# Variance
= n * p * (1-p)
= 1000 * 0.2 * 0.8 = 160

3.1 Simulating from binomial and normal


rnorm(n, mean, sd)
n - number of observations
mean - mean of the distribution. Its default value is 0.
sd - standard deviation of the distribution. Its default value is 1.

rnorm draws values of a random variable X that is normally distributed with mean mu and standard deviation sigma.
# Draw a random sample of 100,000 from the Binomial(1000, .2) distribution
b <- rbinom(100000, 1000, .2)
plot(hist(b, breaks=30))
hist(b, breaks=50, main = "my binomial dist")

# Draw a random sample of 100,000 from the normal approximation


g <- rnorm(100000, 200, sqrt(160))
plot(hist(g, breaks=30))
hist(g, breaks=50, main = "Gaussian dist")


# probability density
g <- rnorm(100000, 0, sqrt(1))
plot(hist(g, breaks=30))
hist(g, breaks=50, main = "Gaussian dist", freq = F)


3.2 Comparing cumulative density of the binomial


pnorm(q, mean, sd)
gives the probability that a normally distributed random number is less than or equal to q.
It is also called the "Cumulative Distribution Function" (CDF).

# Simulations from the normal and binomial distributions


b <- rbinom(100000, 1000, .2)
g <- rnorm(100000, 200, sqrt(160))

# Use the binomial sample b to estimate the probability of <= 190 heads


mean(b <= 190)

# Use the normal sample g to estimate the probability of <= 190 heads


mean(g <= 190)

# Calculate the probability of <= 190 heads with pbinom


pbinom(190, 1000, .2)

# Calculate the probability of <= 190 heads with pnorm


pnorm(190, 200, sqrt(160))
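The two exact calculations can be placed side by side to see how far the normal approximation is from the binomial answer. This is a minimal sketch; the names n, p, mu, sigma, exact and normal_approx are chosen here for illustration.

```r
n <- 1000
p <- 0.2

mu    <- n * p                  # mean of Binomial(1000, 0.2), i.e. 200
sigma <- sqrt(n * p * (1 - p))  # standard deviation, i.e. sqrt(160)

exact         <- pbinom(190, n, p)       # exact binomial P(X <= 190)
normal_approx <- pnorm(190, mu, sigma)   # normal approximation to the same probability

print(c(exact = exact,
        normal_approx = normal_approx,
        difference = exact - normal_approx))
```

The difference is small but not zero, because a continuous curve is being used to approximate a discrete distribution.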


Expected value and variance for random variables

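# dice
library(purrr)  # provides map(), map2(), pmap() and re-exports %>%
dice = 1:6
p = 1/6                 # each face is equally likely
EV = sum(dice) * p      # expected value = 3.5
var = map(dice, function(x) p * (x - EV)^2) %>% unlist %>% sum   # variance = sum of p * (x - EV)^2
sd = sqrt(var)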
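Because R arithmetic is vectorised, the same expected value and variance can also be computed without map(). This is a minimal base-R sketch, assuming a fair six-sided die; the names faces, prob, ev, variance and std_dev are illustrative and chosen so that the built-in var() and sd() functions are not masked.

```r
faces <- 1:6
prob  <- rep(1 / 6, 6)  # fair die: every face equally likely

ev       <- sum(faces * prob)           # expected value: sum of x * P(x)
variance <- sum(prob * (faces - ev)^2)  # variance: sum of P(x) * (x - EV)^2
std_dev  <- sqrt(variance)

print(c(ev = ev, variance = variance, sd = std_dev))
```

For a fair die the expected value is 3.5 and the variance is 35/12.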

# blood type
A couple has a 25% (p) chance of having a child with type O
blood. What is the chance that exactly three (X) of their five (n) kids
have type O blood?

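dbinom(3, 5, .25)   # P(X = 3)
p = map(0:5, function(x) dbinom(x, 5, .25))                  # P(X = k) for k = 0..5
EV = map2(0:5, p, function(x, y) x * y) %>% unlist %>% sum   # sum of k * P(X = k)
var = pmap(
  list(
    as.list(0:5),   # x: number of children with type O blood
    EV,             # y: expected value (recycled for every x)
    p               # z: P(X = x)
  ),
  \(x, y, z) z * (x - y)^2
) %>% unlist %>% sum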

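# Check against the closed-form formulas
n = 5
p = .25
# 1 - p = .75

5*.25       # expected value n * p = 1.25
5*.25*.75   # variance n * p * (1 - p) = 0.9375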
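The agreement between the definition-based calculation and the closed-form formulas can be checked directly. This is a minimal base-R sketch for the blood-type example; the names k, pmf, ev and variance are chosen here for illustration.

```r
n <- 5     # number of children
p <- 0.25  # chance that a child has type O blood
k <- 0:n   # possible numbers of children with type O blood

pmf <- dbinom(k, n, p)  # P(X = k) for k = 0..5

ev       <- sum(k * pmf)           # expected value from the definition
variance <- sum(pmf * (k - ev)^2)  # variance from the definition

print(c(ev = ev, n_times_p = n * p))
print(c(variance = variance, n_p_1_minus_p = n * p * (1 - p)))
```

Each pair of printed values should be equal up to floating-point rounding, which confirms n * p and n * p * (1 - p) for this case.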
