
RV and Distributions


Probability & Random Variables

(basics)
Random Variables

• Random experiment: the outcome cannot be predicted with certainty
– Errors in the measuring process
– Fundamental unpredictability

• Statistics: model and analyze the outcomes


• Sample space S = set of all possible outcomes
• Die X = { 1, 2, 3, 4, 5, 6}: discrete random variable
• Period of a pendulum: continuous random variable


Basic concepts (cont.)

• discrete vs. continuous probabilities


• discrete
– finite number of outcomes
• continuous
– outcomes vary along continuous scale
Continuous probabilities

[Figure: density curve p, plotted for values from -5 to 5]

• total area under curve = 1
• but the probability of any single value = 0
• => interested in the probability associated with intervals
Independent events
• one event has no influence on the outcome
of another event
• if events A & B are independent
then P(A&B) = P(A)*P(B)
• if P(A&B) = P(A)*P(B)
then events A & B are independent
• coin flipping
if P(H) = P(T) = .5 then
P(HTHTH) = P(HHHHH) =
.5*.5*.5*.5*.5 = .5^5 ≈ .03
Mutually Exclusive Events

• mutually exclusive events are not


independent
• rather, the most dependent kinds of events
– if not heads, then tails
– joint probability of 2 mutually exclusive events
is 0
• P(A&B)=0
Basic concepts (cont.)

• if A and B are mutually exclusive events:


P(A or B) = P(A) + P(B)
ex., die roll: P(1 or 6) = 1/6 + 1/6 = .33
• possibility set:
sum of all possible outcomes
~A = anything other than A
P(A or ~A) = P(A) + P(~A) = 1
Conditional Probability
• concerns the probability of one event occurring,
given that another event has occurred

• P(A|B)=Prob of A, given B
Bayes Theorem


$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

$$P(B \mid A) = \frac{P(B \cap A)}{P(A)}$$

$$P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
Bayes Theorem

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

Since
$$P(B) = \sum_i P(B \cap A_i) = \sum_i P(B \mid A_i)\,P(A_i)$$

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{\sum_i P(B \mid A_i)\,P(A_i)}$$
Bayes’ Theorem
An email message can pass through one of two server routes.

Probability of error
Route   % messages   Server1   Server2   Server3   Server4
1       30           0.01      0.015     -         -
2       70           -         -         0.02      0.003

1. What is the probability that a message will arrive without error?
2. If a message arrives in error, find the probability that it was sent through Route 1.
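Ans-1
P(R1) = 0.30; P(R2) = 0.70
P(Error/R1) = 0.01 + 0.015 = 0.025 ------(1)
P(Error/R2) = 0.02 + 0.003 = 0.023 ------(2)
(Each route's error probability is taken as the sum of its two server error probabilities, which ignores the small chance that both servers err.)

P(Error) = (0.30 * 0.025) + (0.70 * 0.023) = 0.0236

Ans 1 = 1 - 0.0236 = 0.9764, i.e. 97.64%

Ans-2
P(R1/Error) = [P(R1) * P(Error/R1)] / P(Error) = (0.30 * 0.025) / 0.0236 = 0.3178
The complementary value, P(R2/Error) = (0.70 * 0.023) / 0.0236 = 0.6822, is the probability that the message was sent through Route 2.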
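The email-route calculation can be checked with a short Python sketch. It assumes, as the worked answer does, that the error probability on a route is the sum of its two server error probabilities. The dictionary names are illustrative only.

```python
# Prior probability of each route, and probability of error given the route.
# Assumption: per-route error = sum of the two server error probabilities.
prior = {"R1": 0.30, "R2": 0.70}
p_error_given_route = {"R1": 0.01 + 0.015, "R2": 0.02 + 0.003}

# Law of total probability: P(Error) = sum over routes of P(route) * P(Error | route)
p_error = sum(prior[r] * p_error_given_route[r] for r in prior)

# Bayes' theorem: P(route | Error) = P(route) * P(Error | route) / P(Error)
posterior = {r: prior[r] * p_error_given_route[r] / p_error for r in prior}

print(round(1 - p_error, 4))                           # probability of no error
print({r: round(p, 4) for r, p in posterior.items()})  # P(route | Error)
```

The two posterior values sum to 1, which is a quick sanity check on any Bayes calculation over a complete set of alternatives.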
Assignment
The probability of the presence of an error in coding is 0.05.

The probability of a tester detecting an error when the error is present is 0.78, and the probability of incorrectly detecting an error when the error is not present is 0.06.

What is the probability that a code is tested as having an error? What is the probability that a code is tested as having an error when the error is present?
NB

• Let’s say we’re testing for a rare disease, where 1% of the population is infected. We have a highly sensitive and specific test, which is not quite perfect:
• 99% of sick patients test positive.
• 99% of healthy patients test negative.
• Given that a patient tests positive, what is the probability that the patient is actually sick?
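One way to work this out is Bayes' theorem with the law of total probability in the denominator. The symbols S (patient is sick), S̄ (patient is healthy) and + (test is positive) are notation introduced here for illustration.

$$P(S \mid +) = \frac{P(+ \mid S)\,P(S)}{P(+ \mid S)\,P(S) + P(+ \mid \bar{S})\,P(\bar{S})} = \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.01 \times 0.99} = 0.5$$

So with these numbers a positive result means only a 50% chance of being sick, because the disease is rare and false positives from the large healthy group are as numerous as true positives.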


NB
PDF/ PMF
• Probability that an event occurs
– probability density function (PDF) - continuous random variables, or
– probability mass function (PMF) - discrete random variables.

• To find the probability that a continuous random variable falls in a particular interval of real numbers, calculate the appropriate area under the curve of f(x).

• Thus, evaluate the integral of f(x) over the interval corresponding to the event of interest. This is represented by
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$
PDF/ PMF
Cumulative Distribution Function (CDF)

The CDF F(x) is defined as the probability that the random variable X assumes a value less than or equal to a given x.

Calculated from the probability density function,
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$$
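The link between the PDF and the CDF can be illustrated numerically: the area under f(x) between a and b should equal F(b) - F(a). The sketch below uses a standard normal distribution and the interval [-1, 2] purely as examples. The `integrate` helper is a hypothetical midpoint-rule routine written for this illustration.

```python
from statistics import NormalDist

def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

X = NormalDist(mu=0.0, sigma=1.0)  # example distribution (assumption)
a, b = -1.0, 2.0                   # example interval (assumption)

area_under_pdf = integrate(X.pdf, a, b)  # P(a <= X <= b) as an area under f(x)
from_cdf = X.cdf(b) - X.cdf(a)           # the same probability as F(b) - F(a)

print(round(area_under_pdf, 4), round(from_cdf, 4))
```

Both numbers agree to within the accuracy of the numerical integration.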


PDF Vs CDF
Axioms of Probability
Probability Distributions and
Probability Density Functions

Figure Probability determined from the area under f(x).

Figure Histogram approximates a probability density function.

Figure Probability density function for Example
Cumulative Distribution Functions

Figure Cumulative distribution function for Example.

DISTRIBUTIONS
Random Variables(RV)
• A random experiment is defined as a process or action whose outcome cannot be predicted with certainty and would likely change when the experiment is repeated.
• The variability in the outcomes might arise from many sources, e.g. slight errors in measurements.
• The sample space is the set of all outcomes from an experiment, e.g. a die: {1,2,3,4,5,6}.
• The outcomes from random experiments are often represented by an uppercase variable such as X.
• This is called a RV, and its value is subject to the uncertainty intrinsic to the experiment.
Random Variables(RV)

• Formally, a RV is a real-valued function defined on the sample space.

• A RV can be discrete or continuous.

o A discrete RV takes values from a finite or countably infinite set of numbers (e.g. the number of typographical errors on a page).

o A continuous RV takes on values from an interval of real numbers (e.g. the inter-arrival times of planes at a runway).
Random Variables(RV)
• An event is a subset of outcomes in the sample space (e.g. the tensile strength of cement is in the range 40 to 50 kg/cm^2).

• Two events that cannot occur simultaneously or jointly are called mutually exclusive events.

• Probability is a measure of the likelihood that some event will occur.

• Probabilities range between 0 and 1. A PDF of a RV describes the probabilities associated with each possible value for the RV.

• Probabilities can be assigned in two ways:
– Equal likelihood model: assign probability 1/n to each of n outcomes.
– Relative frequency method: conduct the experiment n times and record the outcomes. The probability of event E is assigned by P(E) = f ⁄ n, where f denotes the number of experimental outcomes that satisfy event E.
RV
• Discrete RV
o Binomial
o Poisson
• Continuous distributions:
o uniform,
o normal,
o exponential,
o gamma,
o chi-square, the Weibull, the beta and the multivariate
normal etc.
The binomial distribution
• A discrete probability distribution.
• It describes the outcome of n independent trials in an
experiment. Each trial is assumed to have only two
outcomes, either success or failure.
• If the probability of a successful trial is p, then the
probability of having x successful outcomes in an
experiment of n independent trials is as follows.
$$P(X = x) = \binom{n}{x}\,p^x\,(1-p)^{n-x}, \quad x = 0, 1, \ldots, n$$

• Mean E[X] = np and variance V[X] = np(1-p)


The binomial distribution

• Frequently used to model the number of


successes in a sample of size n drawn with
replacement from a population of size N.
The binomial distribution
• Suppose there are twelve multiple choice questions in an
English class quiz. Each question has five possible answers,
and only one of them is correct.

• Find the probability of having
(a) exactly four answers correct
(b) four or fewer correct answers
if a student attempts to answer every question at random.


• Solution
Since only one out of five possible answers is correct, the probability of answering a question correctly at random is 1/5 = 0.2.
(a) The probability of exactly 4 correct answers follows from the binomial PMF with n = 12 and p = 0.2:
P(X = 4) = C(12, 4) * (0.2)^4 * (0.8)^8 ≈ 0.1329
A simulation with NumPy gives an estimate close to this value:
import numpy as np
np.mean(np.random.binomial(12, 0.2, 20000) == 4)  # exactly four answers correct, approx. 0.13

(b) To find the probability of four or fewer correct answers, sum the PMF over x = 0,…,4:
P(X ≤ 4) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) ≈ 0.9274
np.mean(np.random.binomial(12, 0.2, 20000) <= 4)  # four or fewer correct, approx. 0.93

The probability of four or fewer questions answered correctly at random in a twelve-question multiple choice quiz is 92.7%.
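The exact values can also be computed directly from the binomial probability mass function, without simulation. This is a minimal sketch using only the Python standard library. The function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial RV with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 12, 0.2  # twelve questions, one correct answer out of five

p_exactly_4 = binomial_pmf(4, n, p)
p_at_most_4 = sum(binomial_pmf(x, n, p) for x in range(5))  # x = 0, ..., 4

print(round(p_exactly_4, 4))  # 0.1329
print(round(p_at_most_4, 4))  # 0.9274
```

Unlike the simulated estimates, these values do not change from run to run.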
Poisson Distribution

• Limiting case of the binomial, where the chance of success is very small (p -> 0), n is large and np is a small finite quantity. In this regime the binomial no longer gives a practical description, so the Poisson distribution (PD) is used instead.

• E.g. number of printing mistakes in a book; defects in a length of wire, i.e. the “Law of improbable events”.

• PD is appropriate for applications where events occur at points in time/space: arrivals at a bank counter or fuel station, arrival of aircraft at a runway.
Poisson Distribution

• The Poisson distribution is the probability distribution of independent event occurrences in an interval.

• If λ is the mean occurrence per interval, then the probability of having x occurrences within a given interval is:
$$P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots$$

• E[X] = V[X] = λ



Ex Problem
• If there are twelve cars crossing a bridge per minute on average, find the probability of having seventeen or
more cars crossing the bridge in a particular minute.
Solution
• The probability of having sixteen or fewer cars crossing the bridge in a particular minute, P(X ≤ 16) with λ = 12, is
0.89871
• Hence the probability of having seventeen or more cars crossing the bridge in a minute is the upper tail of
the probability mass function, 1 - 0.89871:
0.10129
Answer
• If there are twelve cars crossing a bridge per minute on average, the probability of having seventeen or more
cars crossing the bridge in a particular minute is 10.1%.
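The bridge example can be reproduced by summing the Poisson probability mass function, P(X = x) = λ^x e^(-λ) / x!, for x = 0 to 16 and taking the complement. This is a sketch using only the standard library. The function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson RV with mean lam occurrences per interval."""
    return lam**x * exp(-lam) / factorial(x)

lam = 12  # average number of cars per minute

p_at_most_16 = sum(poisson_pmf(x, lam) for x in range(17))  # x = 0, ..., 16
p_at_least_17 = 1 - p_at_most_16                            # upper tail

print(round(p_at_most_16, 5), round(p_at_least_17, 5))
```

The upper tail is taken as a complement because the Poisson distribution has no largest value to sum up to.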
Expectation
• The mean or expected value (EV) of a RV is defined using the PDF/PMF.
• A measure of central tendency of the distribution. If we observe many
values of the RV and take the average, we expect that value to be close to
the mean.

• EXPECTED VALUE - DISCRETE RV
$$\mu = E[X] = \sum_i x_i\,f(x_i)$$

• VARIANCE - DISCRETE RV
$$\sigma^2 = V(X) = E[(X-\mu)^2] = \sum_i (x_i-\mu)^2\,f(x_i)$$

• If a RV has a large variance, then an observed value of the RV is more likely to be far
from the mean μ.

• The SD is the square root of the variance.
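As an illustration, the mean of a discrete RV is the sum of x·f(x) over all values, and the variance is the sum of (x - μ)²·f(x). The sketch below applies both to a fair six-sided die. The choice of a die and the use of exact fractions are assumptions made for this example.

```python
from fractions import Fraction

# PMF of a fair die: each face 1..6 has probability 1/6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())                    # E[X]
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())  # V(X)
sd = float(variance) ** 0.5                                  # standard deviation

print(mean, variance, round(sd, 4))  # 7/2 35/12 1.7078
```

Using `Fraction` keeps the mean and variance exact, so the results can be compared with a hand calculation.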


Expectation

or
Show that
Moments of a RV
• Other expected values of interest in statistics - moments of a RV.
• The expectation of powers of the RV.
Skewness
• The uniform and the normal distribution are examples of
symmetric distributions.
• The gamma and the exponential are examples of skewed or
asymmetric distributions.
• The 3rd central moment is a measure of asymmetry or
skewness in the distribution.

Coefficient of skewness:
$$\gamma_1 = \frac{\mu_3}{\mu_2^{3/2}}$$
where μ_r = E[(X - μ)^r] is the r-th central moment.

• Distributions skewed to the left: negative coefficient of skewness.
• Distributions skewed to the right: positive coefficient of skewness.
• Symmetric distributions: zero.

However, a coefficient of skewness equal to zero does not mean that the distribution must be
symmetric.
Symmetric vs. Skewed Data
• Median, mean and mode of symmetric, positively and negatively
skewed data

• In a unimodal frequency curve with perfect symmetric data


distribution, the mean, median, and mode are all at the same center
value, as shown in Figure

• Data in most real applications are not symmetric.


• They may instead be either positively skewed, where the mode
occurs at a value that is smaller than the median(Figure), or
• negatively skewed, where the mode occurs at a value greater
than the median (Figure).



Kurtosis
• Kurtosis measures a different type of departure from normality -
indicating the extent of the peak (or the degree of flatness near its
center) in a distribution.
• The coefficient of kurtosis:
$$\gamma_2 = \frac{\mu_4}{\mu_2^{2}}$$
where μ_r = E[(X - μ)^r] is the r-th central moment.

• If the distribution is normal, then this ratio is equal to 3.

– > 3 => more values in the neighborhood of the mean (the curve is more peaked than the
normal distribution).

– < 3 => the curve is flatter than the normal.

– Sometimes the coefficient of excess kurtosis, γ_2 - 3, is used as a measure of kurtosis.
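A small sketch shows how skewness and kurtosis coefficients can be computed from central moments. It uses the common definitions skewness = μ3 / μ2^(3/2) and kurtosis = μ4 / μ2², with μ_r the r-th central moment. The fair die is an illustrative choice of distribution, and `central_moment` is a hypothetical helper.

```python
def central_moment(pmf, r):
    """r-th central moment E[(X - mean)^r] of a discrete RV given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** r * p for x, p in pmf.items())

die = {x: 1 / 6 for x in range(1, 7)}  # fair six-sided die (example)

mu2 = central_moment(die, 2)
mu3 = central_moment(die, 3)
mu4 = central_moment(die, 4)

skewness = mu3 / mu2**1.5  # zero (up to rounding) for a symmetric distribution
kurtosis = mu4 / mu2**2    # below 3 means flatter than the normal

print(abs(round(skewness, 4)), round(kurtosis, 4))
```

For the die the skewness is zero because the distribution is symmetric, and the kurtosis is about 1.73, well below 3, which matches the flat shape of a uniform distribution.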


Continuous Random Variables
• A continuous random variable X takes all values in an interval of numbers (not countable).

• The probability distribution of a continuous r.v. X is described by a density curve.

• The probability of any event is the area under the density curve and above the values of X that make up the event.
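Mean and Variance of a Continuous
Random Variable
Definition
$$\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx, \qquad \sigma^2 = V(X) = \int_{-\infty}^{\infty} (x-\mu)^2\,f(x)\,dx$$
Expected Value of a Function of a Continuous
Random Variable
$$E[h(X)] = \int_{-\infty}^{\infty} h(x)\,f(x)\,dx$$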
Continuous Uniform Random Variable

Definition
$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$

Figure Continuous uniform probability density function.

Mean and Variance
$$E(X) = \frac{a+b}{2}, \qquad V(X) = \frac{(b-a)^2}{12}$$

Figure Probability for Example.

Normal Distribution
Definition

Figure Normal probability density functions for selected values of the parameters μ and σ^2.
Six Sigma
• For any normal RV
– The interval within one sigma of the mean covers 68.27% of the distribution
– The interval within two sigma of the mean covers 95.45%
– Six sigma process is one in which 99.99966%
of all opportunities to produce some feature of
a part are statistically expected to be free of
defects (3.4 defective features per million
opportunities).
– Motorola set a goal of "six sigma" for all of its
manufacturing.
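The sigma coverage figures can be checked with the standard normal CDF from Python's standard library. The `coverage` helper is an illustrative name. The last line relies on the conventional six sigma assumption that the process mean may drift by 1.5 sigma, which this sketch takes as given rather than something stated on the slide.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def coverage(k):
    """Probability that a normal RV falls within k standard deviations of its mean."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * coverage(k), 2))  # 68.27, 95.45, 99.73 percent

# Assumption: a 1.5 sigma shift leaves 4.5 sigma to the nearer specification limit
defects_per_million = (1 - Z.cdf(4.5)) * 1_000_000
print(round(defects_per_million, 1))  # 3.4
```

Under that assumption the 3.4 defects per million opportunities corresponds to the one-sided tail beyond 4.5 sigma, not to the plain coverage of plus or minus six sigma.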
Ex – Normal distribution

• Current in a strip of wire is assumed to follow a normal distribution with mean 10 mA and variance 4 (mA)^2.

• Find the probability that the measurement of current will exceed 13 mA.

The continuous random variable X has the normal distribution if the pdf is:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$$
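For the wire-current example (mean 10 mA, variance 4 (mA)^2, threshold 13 mA), the probability can be sketched by standardizing and using the normal CDF. The variable names are illustrative.

```python
from statistics import NormalDist

# Variance 4 (mA)^2 means a standard deviation of 2 mA
current = NormalDist(mu=10, sigma=4**0.5)

z = (13 - current.mean) / current.stdev  # standardized value of 13 mA
p_exceeds_13 = 1 - current.cdf(13)       # P(X > 13) = P(Z > z)

print(z, round(p_exceeds_13, 4))  # 1.5 0.0668
```

So roughly 6.7% of current measurements would be expected to exceed 13 mA under this model.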
Normal Distribution

Definition : Standard Normal

Appendix Table II

Figure Standard normal probability density function.

Standardizing

Figure Standardizing a normal random variable.

To Calculate Probability

Figure Determining the value of x to meet a specified probability.

Exponential Distribution
Definition
• Used to model the amount of time until a specific event occurs, or the time between independent events.
• Examples:
– the time until the computer locks up,
– the time between arrivals of telephone calls, or
– the time until a part fails.

λ is the average arrival rate of those events.

CDF:
$$F(x) = \begin{cases} 1 - e^{-\lambda x}, & x \ge 0 \\ 0, & \text{elsewhere} \end{cases}$$
Exponential Distribution

• Memoryless property: P(X > s + t | X > s) = P(X > t)
• => the probability that the object will operate for time s+t, given it has already operated for time s, is simply the probability that it operates for time t.

• When the exponential is used to represent inter-arrival times, the parameter λ is a rate with units of arrivals per time period.

• When the exponential is used to model the time until a failure occurs, λ is the failure rate.
Exponential Distribution
• The time between arrivals of vehicles at an intersection
follows an exponential distribution with a mean of 12
seconds. What is the probability that the time between
arrivals is 10 seconds or less?

• Given the average inter-arrival time of 12 seconds, λ = 1 ⁄ 12. The required probability is
$$P(X \le 10) = 1 - e^{-10/12} \approx 0.57$$
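The vehicle-arrival probability can be computed from the exponential CDF, F(x) = 1 - exp(-λx) for x ≥ 0. This is a minimal sketch, and the function name `exponential_cdf` is illustrative.

```python
from math import exp

def exponential_cdf(x, lam):
    """P(X <= x) for an exponential RV with rate lam."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0

mean_gap = 12        # mean time between arrivals, in seconds
lam = 1 / mean_gap   # rate: arrivals per second

p_within_10 = exponential_cdf(10, lam)
print(round(p_within_10, 4))  # 0.5654
```

The rate is the reciprocal of the mean, so a longer average gap between vehicles gives a smaller λ and a smaller probability of an arrival within 10 seconds.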
Exponential Distribution

• An interesting property of an exponential random variable is the lack of memory property.

• Our starting point for observing the system does not matter.

• In the example, suppose that there are no vehicles arriving from 10:00 to 10:15 AM; the probability that a vehicle arrives in the next 10 secs is still 0.57.

• Because we have already been waiting for 15 minutes, we feel that we are “due”, i.e. we expect that the probability of a vehicle arriving in the next 10 secs should be greater than 0.57. For an exponential distribution it is not.
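The lack of memory property can be checked numerically for the vehicle example (mean gap 12 seconds, 15 minutes already waited). The sketch works with the survival function P(X > x) = exp(-λx) directly, because subtracting a CDF value from 1 would lose precision for long waits. The function and variable names are illustrative.

```python
from math import exp

lam = 1 / 12  # arrivals per second (mean inter-arrival time of 12 seconds)

def survival(x):
    """P(X > x): probability that no vehicle has arrived by time x."""
    return exp(-lam * x)

waited = 15 * 60  # 15 minutes already spent waiting, in seconds
window = 10       # the next 10 seconds

# P(arrival within the next 10 s | no arrival during the first 15 min)
p_conditional = 1 - survival(waited + window) / survival(waited)

# P(arrival within 10 s) when observation has only just started
p_fresh = 1 - survival(window)

print(round(p_conditional, 4), round(p_fresh, 4))
```

Both probabilities come out the same (about 0.57), which is what the lack of memory property states.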
Exponential Distribution

Figure Lack of memory property of an Exponential distribution.

Assignment

• Suppose the mean checkout time of a supermarket cashier is three minutes. Find the probability of a customer checkout being completed by the cashier in less than two minutes. (0.48658)

• Assume that the test scores of a college entrance exam fit a normal distribution. Furthermore, the mean test score is 72, and the standard deviation is 15.2. What is the percentage of students scoring 84 or more in the exam? (0.21492)
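The quoted answers can be checked with a short sketch. It assumes the checkout time follows an exponential distribution with mean three minutes. The assignment does not state the distribution, so this is an assumption, though it is consistent with the quoted answer.

```python
from math import exp
from statistics import NormalDist

# Checkout time: assumed exponential with mean 3 minutes, so rate = 1/3 per minute
rate = 1 / 3
p_checkout_under_2 = 1 - exp(-rate * 2)  # P(X < 2)

# Test scores: normal with mean 72 and standard deviation 15.2
scores = NormalDist(mu=72, sigma=15.2)
p_score_84_or_more = 1 - scores.cdf(84)  # P(X >= 84)

print(round(p_checkout_under_2, 5), round(p_score_84_or_more, 5))
```

The second value is a probability; multiplied by 100 it gives the percentage of students asked for, about 21.5%.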
