Statistics and Probability
Assignment
STT071
Mr. Charly Bongabong
Alfon G. Rusiana
Date: January 31, 2025
Assignment 1: Bernoulli Distribution
Scenario:
In a spam email detection system, each incoming email has a 0.6
probability of being classified as spam and a 0.4 probability of being
classified as non-spam.
Tasks:
1. Simulate 500 email classifications using the Bernoulli distribution,
where 1 = Spam and 0 = Not Spam.
Answer
R code:
# Set seed for reproducibility
set.seed(123)
# Parameters
n <- 500 # Number of emails
p_spam <- 0.6 # Probability of an email being classified as spam
Visualization
A bar plot of the classification results (Spam vs Not Spam) will display
the counts of each classification.
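The simulation step itself can be sketched as follows. This is a minimal illustration, assuming a Bernoulli draw is generated as a binomial draw with size = 1; the names `classifications` and `counts` are hypothetical and do not come from the original answer.

```r
# Sketch: simulate 500 Bernoulli(0.6) email classifications.
set.seed(123)                 # same seed as in the answer above
n <- 500                      # number of emails
p_spam <- 0.6                 # probability of an email being spam

# rbinom() with size = 1 returns 0/1 outcomes (1 = Spam, 0 = Not Spam).
# `classifications` and `counts` are illustrative names only.
classifications <- rbinom(n, size = 1, prob = p_spam)

# Count how many emails fall into each class.
counts <- table(factor(classifications,
                       levels = c(0, 1),
                       labels = c("Not Spam", "Spam")))
print(counts)

# Observed share of spam; it should land near p_spam = 0.6.
print(mean(classifications))
```

Passing `counts` to `barplot()` would draw the bar plot described above.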
\[ P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k} \]
where:
n is the number of trials (3 attempts),
k is the number of successes (2 successful logins),
p is the probability of success on each trial (0.75 probability of
success).
In our case, we have: n = 3, k = 2, p = 0.75
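First, we compute the binomial coefficient \binom{3}{2}:
\[ \binom{3}{2} = \frac{3!}{2!\,(3 - 2)!} = 3 \]
Next, we compute the probability P(X = 2):
\[ P(X = 2) = \binom{3}{2} (0.75)^2 (1 - 0.75)^1 = 3 \times 0.5625 \times 0.25 = 0.421875 \]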
where:
• n = 3 is the number of login attempts in each trial,
• p = 0.75 is the probability of success on each attempt,
• k is the number of successes (successful logins),
• the number of simulated trials is 5000.
To simulate the 5000 trials, we use the binomial distribution in R. The R
code is as follows:
R Code:
# Set seed for reproducibility
set.seed(123)
# Parameters
n <- 3 # Number of attempts
p <- 0.75 # Probability of success
trials <- 5000 # Number of trials
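Continuing from these parameters, one possible way to run the simulation is with `rbinom()`. The sketch below is an assumption about how the step could look, not the original code; `successes` and `freq` are hypothetical names.

```r
# Sketch: 5000 simulated trials of 3 login attempts each.
set.seed(123)
n <- 3          # number of attempts per trial
p <- 0.75       # probability of success on each attempt
trials <- 5000  # number of simulated trials

# Each element of `successes` (illustrative name) is the number of
# successful logins out of n attempts in one simulated trial.
successes <- rbinom(trials, size = n, prob = p)

# Relative frequency of 0, 1, 2 and 3 successes across all trials.
freq <- table(factor(successes, levels = 0:n)) / trials
print(freq)
```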
\[ P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k} \]
where:
n = 3 is the number of trials (login attempts),
k = 2 is the number of successes we are interested in,
p = 0.75 is the probability of success on a single trial.
Substituting the values into the formula:
\[ P(X = 2) = \binom{3}{2} (0.75)^2 (0.25)^1 = 3 \times 0.5625 \times 0.25 = 0.421875 \]
Thus, the theoretical probability of exactly 2 successes is 0.421875.
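The hand calculation can be cross-checked in R. The sketch below evaluates the binomial formula directly and compares it with the built-in `dbinom()`; the variable names `by_formula` and `by_dbinom` are illustrative.

```r
# Sketch: check the theoretical probability of exactly 2 successes.
n <- 3     # number of login attempts
k <- 2     # number of successes of interest
p <- 0.75  # probability of success on a single attempt

# Binomial formula written out term by term.
by_formula <- choose(n, k) * p^k * (1 - p)^(n - k)

# Same quantity from R's built-in binomial probability mass function.
by_dbinom <- dbinom(k, size = n, prob = p)

print(by_formula)
print(by_dbinom)
```

Both values should agree with the 0.421875 obtained above.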
Simulated Probability
Next, we will simulate 5000 trials of 3 login attempts with a probability
of success of 0.75. We will then compute the probability of exactly 2
successes based on the simulation results.
The R Code:
# Parameters
n <- 3 # Number of attempts
p <- 0.75 # Probability of success
trials <- 5000 # Number of trials
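From these parameters, the simulated probability can be estimated as the share of trials with exactly 2 successes. This is a hedged sketch: the seed is assumed to match the earlier answers, and `successes` and `simulated_prob` are hypothetical names.

```r
# Sketch: estimate P(X = 2) from 5000 simulated trials.
set.seed(123)   # assumed seed, chosen to match the earlier answers
n <- 3          # number of attempts
p <- 0.75       # probability of success
trials <- 5000  # number of trials

successes <- rbinom(trials, size = n, prob = p)

# Proportion of trials in which exactly 2 of the 3 attempts succeeded.
simulated_prob <- mean(successes == 2)

# Theoretical value taken from the calculation in the text.
theoretical_prob <- 0.421875

print(simulated_prob)
print(abs(simulated_prob - theoretical_prob))  # gap between the two
```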
5. Interpretation:
Answer:
Significance of Matching:
If the simulated and theoretical probabilities are close, this indicates
that the simulation is set up correctly and that the number of trials is
large enough for the simulated results to approximate the theoretical
binomial distribution.
If the match is poor, the number of trials may be too small or the
simulation setup may be flawed.
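One way to judge whether the match is "close" is the standard error of a simulated proportion, sqrt(p(1 - p) / N), where N is the number of trials. The sketch below applies that standard formula to the theoretical value 0.421875; the trial counts other than 5000 are illustrative and not taken from the assignment.

```r
# Sketch: how much a simulated proportion is expected to vary by chance.
p_theory <- 0.421875

# 5000 is the number of trials used in the text; 100 and 1000 are
# illustrative comparison points only.
trial_counts <- c(100, 1000, 5000)

# Standard error of a sample proportion for each number of trials.
std_error <- sqrt(p_theory * (1 - p_theory) / trial_counts)

print(data.frame(trials = trial_counts, std_error = std_error))
```

With 5000 trials the standard error is roughly 0.007, so a gap of about 0.01 between the simulated and theoretical values would be unremarkable, while a much larger gap would point to a setup problem.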
Tasks:
1. Compute the probability of exactly 6 vehicles arriving in one
minute using the Poisson formula.
Answer:
Given:
Answer:
R Code:
# Parameters
head(vehicle_arrivals, 10)
hist(vehicle_arrivals,
     ylab = "Frequency",
     col = "lightblue",
     border = "black")
Explanation of R Code:
Answer:
We are interested in computing the probability of exactly 6
vehicles arriving in a 1-minute interval. To do this, we will
compute both the theoretical and simulated probabilities.
Theoretical Probability
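The Poisson formula is P(X = k) = e^(-lambda) * lambda^k / k!, with lambda the average number of arrivals per minute. The sketch below evaluates it for k = 6. The arrival rate is not shown in this section, so `lambda <- 5` is a placeholder assumption and should be replaced with the rate given in the assignment.

```r
# Sketch: theoretical Poisson probability of exactly 6 arrivals.
# ASSUMPTION: lambda = 5 vehicles per minute is a placeholder only;
# it is not the assignment's value.
lambda <- 5
k <- 6  # number of vehicles of interest

# Poisson formula written out term by term.
by_formula <- exp(-lambda) * lambda^k / factorial(k)

# Same quantity from R's built-in Poisson probability mass function.
by_dpois <- dpois(k, lambda)

print(by_formula)
print(by_dpois)
```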
We will now simulate the vehicle arrival process 5000 times and
estimate the probability of exactly 6 vehicles arriving from those 5000
trials.
R Code for Simulation
The following R code simulates 5000 trials of vehicle arrivals using
the Poisson distribution and calculates the simulated probability of
getting exactly 6 vehicles:
R Code:
# Parameters
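A minimal sketch of such a simulation is given below. It reuses the name `vehicle_arrivals` from the plotting code in this document; the seed and, in particular, the arrival rate `lambda <- 5` are assumptions, since the rate is not shown in this section.

```r
# Sketch: simulate 5000 one-minute intervals of vehicle arrivals.
set.seed(123)   # assumed seed, chosen to match the earlier answers
lambda <- 5     # ASSUMPTION: placeholder rate, not the assignment's value
trials <- 5000  # number of simulated one-minute intervals

# Number of vehicles arriving in each simulated minute.
vehicle_arrivals <- rpois(trials, lambda)

# Share of simulated minutes with exactly 6 arrivals.
simulated_prob <- mean(vehicle_arrivals == 6)

print(simulated_prob)
print(dpois(6, lambda))  # theoretical value for the same assumed rate
```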
Answer:
R Code:
# Parameters
hist(vehicle_arrivals,
     ylab = "Frequency",
     col = "lightblue",
     border = "black")
Explanation of R Code:
Visual Output:
Answer:
Answer: