
Probability 2 FPM

This document discusses various probability distributions that are used to model random variables. It introduces discrete and continuous random variables. It describes key concepts like the probability mass/density function, cumulative distribution function, expected value, variance, and several specific probability distributions such as the binomial, negative binomial, geometric, hypergeometric, and Poisson distributions. Examples are provided to illustrate how to calculate probabilities and parameter values for each distribution.

Uploaded by

Nikhil

Preparatory Quantitative Methods

Tushar Tanwar
Random Variable
 Random variable
A random variable is a function from the sample space Ω to R
 Example:
 We extract 3 balls from a box containing an equal number of white and
black balls, and we put the ball back into the box after each extraction. How
many white balls can occur and what are the associated probabilities?
 Sample space: Ω = {WWW, WWB, WBW, BWW, WBB, BWB, BBW, BBB}, where W denotes a white ball and B a black ball.

 A random variable has a probability law
 a rule that assigns probabilities to the different values of the random variable
 The function f defined by f(x) = P(X = x) is called the frequency function or probability function.
Probability distribution

 If X is a random variable which can take the values $x_1, x_2, \ldots, x_n$ with the probabilities $f(x_1), f(x_2), \ldots, f(x_n)$, then the set of ordered pairs $(x_i, f(x_i))$, $i = 1, \ldots, n$, is called the probability distribution of the random variable X.
 Example:
 I toss a fair coin three times. The random variable X gives the number
of heads recorded. Find the probability distribution.
 Find the probability distribution of the sum of two dice.
 What is the probability that the sum will be at most 5? This is P(X ≤ 5).
 What is the probability that the sum is greater than 9?
 Plot it graphically as a probability histogram.
 Two socks are selected at random and removed in succession from a drawer containing five brown socks and three green socks. Find the probability distribution of the random variable W, where W is the number of brown socks selected.
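The two-dice exercise above can be checked by brute-force enumeration. This is a minimal sketch, assuming two fair six-sided dice; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (die1, die2) outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

# Probabilities asked for on the slide.
p_at_most_5 = sum(p for s, p in pmf.items() if s <= 5)
p_greater_9 = sum(p for s, p in pmf.items() if s > 9)

for s, p in pmf.items():
    print(f"P(X = {s}) = {p}")
print("P(X <= 5) =", p_at_most_5)
print("P(X > 9)  =", p_greater_9)
```

Exact fractions are used so that the probabilities can be compared directly with a hand calculation.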
Discrete and Continuous Random Variables
 A discrete random variable can assume at most a countable number of
values.
 Example: Number of calls you make in a day
 A continuous random variable may take on any value in an interval of numbers.
 The values of continuous random variables can be measured (at least in theory) to any degree of accuracy. Example: temperature
Probability distribution of a discrete random variable

 The probability distribution function of a discrete random variable X must satisfy the following two conditions.
1. $P(x) \ge 0$ for all values x
2. $\sum_{x} P(x) = 1$
 P(X =x) is the probability of the event that the random variable equals x.
 Check whether the function given by f(x)=(x+2)/25 for x=1,2,3,4,5 can be a
probability distribution.
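One way to carry out this check is to test the two conditions directly (non-negative values that sum to 1). A small sketch with illustrative names:

```python
from fractions import Fraction

# Candidate probability function f(x) = (x + 2) / 25 for x = 1, ..., 5.
f = {x: Fraction(x + 2, 25) for x in range(1, 6)}

all_nonnegative = all(p >= 0 for p in f.values())  # condition 1
total = sum(f.values())                            # condition 2: must equal 1
is_valid = all_nonnegative and total == 1

print("values:", [str(p) for p in f.values()])
print("sum =", total, "-> valid distribution:", is_valid)
```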
Cumulative Distribution Function

 The cumulative distribution function, F(x), of a discrete random variable X is $F(x) = P(X \le x) = \sum_{t \le x} f(t)$ for all real x.
 F is a nondecreasing function taking values from 0 to 1.
 Find cumulative Distribution function of the Sum of Two Dice
 F(1) = P(X ≤ 1) = 0, F(2) = P(X ≤ 2) = f(2) = 1 /36 , F(3)=? …
 F(2.6) = P(X ≤ 2.6) = 1/ 36 (as x takes real values).
 The graph of the distribution function is a step function (figure not reproduced).

 Also, P(X>2)=1-F(2)
P(3
Example
 Find the distribution function of total number of heads obtained in four
tosses of a balanced coin.
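A sketch of this example, assuming a fair coin so that the number of heads in four tosses has probabilities C(4, x)/16. The distribution function is the running sum of those probabilities.

```python
from fractions import Fraction
from itertools import accumulate
from math import comb

n = 4  # number of tosses

# f(x) = C(4, x) / 2^4 for x = 0, 1, ..., 4
pmf = [Fraction(comb(n, x), 2**n) for x in range(n + 1)]

# F(x) = f(0) + f(1) + ... + f(x)
cdf = list(accumulate(pmf))

for x, (p, F) in enumerate(zip(pmf, cdf)):
    print(f"x = {x}: f(x) = {p}, F(x) = {F}")
```

Between integer values F stays constant, so for instance F(2.6) equals F(2).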
Expected Value/Mean

 The mean of a probability distribution of a random variable is a measure of the centrality of the probability distribution.
 The mean is a weighted average of the possible values of the random variable, the weights being the probabilities.
 The expected value of a discrete random variable X is given as:
 $\mu = E(X) = \sum_{x} x\,P(x)$, summed over all x
 Find mean value for the below described random variable:
Expected Value of a Function

 The Expected Value of a Function of a Random Variable
 Let h(X) be a function of the discrete random variable X.
 $E[h(X)] = \sum_{x} h(x)\,P(x)$, summed over all x
 Monthly sales of a certain product, recorded to the nearest thousand, are
believed to follow the probability distribution given in table below. Suppose
that the company has a fixed monthly production cost of $8,000 and that
each item brings $2. Find the expected monthly profit from product sales.
 The expected value of a linear function of a random variable
E(aX + b) = aE(X) + b , where a and b are fixed numbers
Variance

 Variance and Standard Deviation of a Random Variable
 The variance of a random variable is the expected squared deviation of the random variable from its mean:
 $\sigma^2 = V(X) = E[(X-\mu)^2] = \sum_{x} (x-\mu)^2 P(x)$, summed over all x
 Equivalently, $V(X) = E(X^2) - [E(X)]^2$
 It is a measure of the dispersion of the possible values of the random variable about the mean.
 The standard deviation of a random variable is $\sigma = \sqrt{V(X)}$
Example
 Find variance using both the definitions :
Example
 I toss a fair coin three times; X is the number of heads. What are the
expected value and variance of X?
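The coin example can be worked through numerically. This sketch assumes a fair coin and computes the variance both as the expected squared deviation and as E(X²) − [E(X)]², so the two routes can be compared.

```python
from fractions import Fraction
from math import comb

# Number of heads in three fair tosses: P(X = x) = C(3, x) / 8.
pmf = {x: Fraction(comb(3, x), 8) for x in range(4)}

mean = sum(x * p for x, p in pmf.items())

# Definition: expected squared deviation from the mean.
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Shortcut: E(X^2) - [E(X)]^2.
var_shortcut = sum(x * x * p for x, p in pmf.items()) - mean**2

print("E(X) =", mean)
print("V(X) by definition =", var_definition)
print("V(X) by shortcut   =", var_shortcut)
```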
Variance of a linear function

 Variance of a linear function of a random variable is $V(aX+b) = a^2 V(X) = a^2\sigma^2$
(a and b are constants)
Sum of random variables
 The expected value of the sum of several random variables is the sum of the individual expected values:
$E(a_1X_1 + a_2X_2 + \cdots + a_kX_k) = a_1E(X_1) + a_2E(X_2) + \cdots + a_kE(X_k)$
 If $X_1, X_2, \ldots, X_k$ are mutually independent, then the variance of their sum is the sum of their individual variances. That is,
$V(a_1X_1 + a_2X_2 + \cdots + a_kX_k) = a_1^2V(X_1) + a_2^2V(X_2) + \cdots + a_k^2V(X_k)$
Chebyshev’s Theorem

 The standard deviation is useful in obtaining bounds on the possible values of the random variable with certain probability.
 Chebyshev’s Theorem says that for any number k greater than 1, the probability that the value of a given random variable will be within k standard deviations of the mean is at least $1 - 1/k^2$.
 For a random variable X with mean $\mu$ and standard deviation $\sigma$, and for k > 1, $P(|X-\mu| < k\sigma) \ge 1 - 1/k^2$
Example
 For k=2 , the value of random variable will be within 2 standard deviations
from mean with at least .75 probability.
 In other terms, at least 75% of observations lie within 2 standard deviation
of mean.
 A sample of size n = 50 has mean x=28 and standard deviation s = 3.
Without knowing anything else about the sample, what can be said about
the number of observations that lie in the interval (22,34)?
 Check chebyshev’s theorem for
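For the sample question above (n = 50, mean 28, s = 3, interval (22, 34)), the bound can be computed as follows. This is a sketch only: it treats the interval as mean ± k·s and applies the 1 − 1/k² bound to the sample.

```python
import math

n, mean, s = 50, 28, 3
low, high = 22, 34

# The interval is symmetric about the mean: (mean - k*s, mean + k*s).
k = (high - mean) / s

# Chebyshev: at least 1 - 1/k^2 of the observations lie within k standard deviations.
bound = 1 - 1 / k**2

# The smallest whole number of observations consistent with the bound.
min_count = math.ceil(bound * n)

print(f"k = {k}, lower bound = {bound:.2f}, at least {min_count} of {n} observations")
```

The bound says nothing about where exactly the observations fall, only that at least this many of them lie inside the interval.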
Bernoulli Random Variable
 Bernoulli Random Variable

 If the outcome of a trial can only be either a success or a failure, then the
trial is a Bernoulli trial.
 The number of successes X in one Bernoulli trial, which can be 1 or 0, is a
Bernoulli random variable.
 x =1 is called “success” and x =0 is called “failure.”
 Example: Tossing a coin.
Binomial Random Variable
 The Binomial Random Variable X~B(n,p)
 Consider n identically and independently distributed Bernoulli random variables $X_1, X_2, \ldots, X_n$
 Such a sequence of identically and independently distributed Bernoulli
variables is called a Bernoulli process
 An X that counts the number of successes in many independent, identical
Bernoulli trials is called a binomial random variable.
 Example: Number of correct guesses at 30 true-false questions when you
randomly guess all answers.
Binomial Random Variable
 Conditions for a Binomial Random Variable
 1. The trials must be Bernoulli trials in that the outcomes can only be either
success or failure.
 2. The outcomes of the trials must be independent.
 3. The probability of success in each trial must be constant.

 Find the probability of getting five heads and seven tails in 12 flips of a
balanced coin.
 Find the probability that seven of ten persons will recover from a tropical
disease if we can assume independence and the probability is 0.8 that any
one of them will recover from the disease.
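The two exercises above can be evaluated with the standard binomial probability function P(X = x) = C(n, x) pˣ (1 − p)ⁿ⁻ˣ. The formula is not shown on the slides, and the helper name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p): x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Five heads (and seven tails) in 12 flips of a balanced coin.
p_five_heads = binomial_pmf(5, 12, 0.5)

# Seven of ten persons recover when each recovers with probability 0.8.
p_seven_recover = binomial_pmf(7, 10, 0.8)

print(f"P(5 heads in 12 flips) = {p_five_heads:.4f}")
print(f"P(7 of 10 recover)     = {p_seven_recover:.4f}")
```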
Negative Binomial Distribution
 Negative Binomial Distribution X~ NB(s, p)
 The number of successes is held constant, and the number of trials is random.
 Let s denote the exact number of successes desired and p the probability of
success in each trial. Let X denote the number of trials made until the
desired number of successes is achieved.
 The number of trials made in this scenario is said to follow a negative
binomial distribution.
 If the probability is 0.40 that a child exposed to a contagious disease will
catch it , what is the probability that the tenth child exposed to the disease
will be the third to catch it?
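A sketch of the exercise above, using the usual negative binomial probability function P(X = x) = C(x − 1, s − 1) pˢ (1 − p)ˣ⁻ˢ. The formula is not shown on the slide: the x-th trial must be the s-th success, with s − 1 successes among the first x − 1 trials.

```python
from math import comb


def negative_binomial_pmf(x: int, s: int, p: float) -> float:
    """P(the s-th success occurs on trial x), each trial succeeding with probability p."""
    return comb(x - 1, s - 1) * p**s * (1 - p) ** (x - s)


# The tenth child exposed is the third to catch the disease (p = 0.40).
p_tenth_is_third = negative_binomial_pmf(10, 3, 0.40)
print(f"P(X = 10) = {p_tenth_is_third:.4f}")
```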
The Geometric Distribution
 The Geometric Distribution
 In a negative binomial distribution, the number of desired successes s can be any number.
 The geometric distribution is the special case of the negative binomial distribution where s = 1.
 Let X be the (random) number of Bernoulli trials, each having probability p of success, required to achieve just one success.
 If the probability is 0.75 that an applicant for a driver’s license will pass the
road test on any given try, what is the probability that an applicant will
finally pass the test on the fourth try.
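For the driver's license exercise, the first success on trial x requires x − 1 failures followed by one success, so P(X = x) = (1 − p)ˣ⁻¹ p. A minimal sketch with an illustrative helper name:

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(first success occurs on trial x) for independent trials with success probability p."""
    return (1 - p) ** (x - 1) * p


# Three failed road tests followed by a pass on the fourth try (p = 0.75).
p_pass_on_fourth = geometric_pmf(4, 0.75)
print(f"P(X = 4) = {p_pass_on_fourth:.6f}")
```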
Hypergeometric distribution
 When a pool of size N contains S successes and (N − S) failures, and a random sample of size n is drawn from the pool, the number of successes X in the sample follows a hypergeometric distribution.
 The probability of success p is neither constant nor independent from trial to trial (sampling without replacement).
 Therefore, X does not follow a binomial distribution, but follows what is
called a hypergeometric distribution.
Example
 As a part of an air pollution survey, an inspector decides to examine the exhaust of six of a company’s 24 trucks. If four of the company's trucks emit excessive amounts of pollutants, what is the probability that none of them will be included in the inspector’s sample?
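The truck example can be sketched with the standard hypergeometric probability function P(X = x) = C(S, x) C(N − S, n − x) / C(N, n). The formula is not shown on the slide. Here N = 24, S = 4, n = 6 and x = 0.

```python
from fractions import Fraction
from math import comb


def hypergeometric_pmf(x: int, N: int, S: int, n: int) -> Fraction:
    """P(x successes in a sample of n drawn without replacement from N items, S of them successes)."""
    return Fraction(comb(S, x) * comb(N - S, n - x), comb(N, n))


# None of the 4 polluting trucks appears among the 6 trucks inspected.
p_none = hypergeometric_pmf(0, N=24, S=4, n=6)
print(f"P(X = 0) = {p_none} = {float(p_none):.4f}")
```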
Poisson distribution
 Let us assume a machine produces 20,000 pins and has a 1/10,000 chance of producing a perfect one.
 Variable: the number of perfect pins produced.
 Use the binomial distribution with n = 20,000 and p = 1/10,000.
 What is the expected number of perfect pins produced? (n·p = 2)
 The binomial formula for P(X = x) can be approximated by the Poisson distribution.
 We need only the mean (n·p = constant) for the Poisson distribution.
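The quality of this approximation can be inspected numerically. The sketch below makes two assumptions: it reads the chance of a perfect pin as 1/10,000 (the value consistent with n·p = 2), and it uses the standard Poisson probability function e^(−λ) λˣ / x!, which is not shown on the slide.

```python
from math import comb, exp, factorial

n, p = 20_000, 1 / 10_000  # assumed reading of the slide's numbers
lam = n * p                # mean of the approximating Poisson distribution


def binomial_pmf(x: int) -> float:
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x: int) -> float:
    return exp(-lam) * lam**x / factorial(x)


# Compare the exact binomial and the Poisson approximation for small x.
rows = [(x, binomial_pmf(x), poisson_pmf(x)) for x in range(6)]
for x, b, q in rows:
    print(f"x = {x}: binomial = {b:.6f}, Poisson = {q:.6f}")
```

With n this large and p this small, the two columns should agree to several decimal places.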
Poisson distribution
 It gives the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known average rate and independently of the time since the last event.
 So, if we count the number of times a rare event occurs during a fixed interval,
then that number would follow a Poisson distribution.

 Records show that the probability is 0.00005 that a car will have a flat tire while
crossing a certain bridge. Find probability that among 10,000 cars crossing this
bridge,
 (a) exactly two will have flat tire
 (b) at most two will have a flat tire.
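A sketch of the flat-tire exercise, assuming the Poisson approximation with λ = n·p = 10,000 × 0.00005 = 0.5 and the standard Poisson probability function. The helper name is illustrative.

```python
from math import exp, factorial

lam = 10_000 * 0.00005  # expected number of flat tires among 10,000 cars


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)


# (a) exactly two flat tires
p_exactly_two = poisson_pmf(2, lam)

# (b) at most two flat tires: P(0) + P(1) + P(2)
p_at_most_two = sum(poisson_pmf(x, lam) for x in range(3))

print(f"(a) P(X = 2)  = {p_exactly_two:.4f}")
print(f"(b) P(X <= 2) = {p_at_most_two:.4f}")
```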
Example
 It is known from past experience that in a certain plant there are on average 4 industrial accidents per month. Find the probability that in a given year there will be fewer than 3 accidents.
Continuous random variable
 A continuous random variable is a random variable that can take on any
value in an interval of numbers. Example: X- time taken
Probability density function

 The probabilities associated with a continuous random variable X are determined by the probability density function of the random variable. The function, denoted f(x), has the following properties.
1. $f(x) \ge 0$ for all x.
2. The total area under the entire curve of f(x) is equal to 1, i.e. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
Example

 The probability that X will be between two numbers a and b is equal to the area under f(x) between a and b.
 When the sample space is continuous, the probability of any single given value is zero.
 For a continuous random variable, nonzero probabilities are associated only with intervals of numbers.
 $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$
 Example:
 If X has the probability density

 Find k and P(0.5


CDF

 The cumulative distribution function of a continuous random variable is $F(x) = P(X \le x)$, which is the area under f(x) between the smallest possible value of X (often $-\infty$) and the point x.
 Equivalently, $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
 F is a nondecreasing function taking values from 0 to 1.
 Also, $f_X(x) = F_X'(x)$
 $P(a \le X \le b) = F_X(b) - F_X(a) = \int_{a}^{b} f_X(x)\,dx$


Example

 Find
  the distribution function of random variable X(earlier example) and
use it to evaluate P(0.5
 Find the probability density function for the random variable whose
distribution function is given by F(x)=

 Expectation: $E(X) = \int_{-\infty}^{\infty} x\,f_X(x)\,dx$
 Variance: $Var(X) = E(X^2) - [E(X)]^2$
 where $E(X^2) = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx$
 Suppose fX(x)=
 Find cdf of X, Expectation and Variance of X
Moments

 Moments of a random variable
 rth moment about the origin of a random variable
 Denoted as $\mu_r'$, it is the expected value of $X^r$: $\mu_r' = E(X^r)$
 The first moment about the origin is the mean of the random variable.
 rth moment about the mean of a random variable
 Denoted as $\mu_r$, it is the expected value of $(X-\mu)^r$: $\mu_r = E[(X-\mu)^r]$
 The second moment about the mean is the variance of the random variable.


Skewness & kurtosis

 Skewness: lack of symmetry
 Coefficient of skewness is given as $\mu_3/\sigma^3$, the third moment about the mean divided by the cube of the standard deviation
 Kurtosis: measure of peakedness of a distribution
 Larger kurtosis- more peaked
 Relative kurtosis = Absolute kurtosis -3 ( in relation to normal distribution)
 Negative kurtosis means flatter distribution than normal- platykurtic.
 Positive kurtosis implies a more peaked distribution than normal-
leptokurtic.
Continuous distributions
 Uniform distribution
 X can model a randomly chosen point from the interval [a, b], where each
choice is equally likely.
The Exponential Distribution

 Relation to Poisson variable:
 We have events which occur randomly but at a constant average rate of λ per unit time. The Poisson random variable, which is discrete, counts how many events will occur in the next unit of time. The exponential random variable, which is continuous, measures exactly how long from now it is until the next event occurs.
 Suppose we arrive at the scene at any given time and wait till the event occurs. The waiting time will then follow an exponential distribution: $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$
 Continuous form of geometric
 Suppose our waiting time was x. For the event (or success) to occur at time
x, every tiny duration t from time 0 to time x should be a failure and the
interval x to x + t must be a success. This is nothing but a geometric
distribution.
The Exponential Distribution
 The time gap between two successive arrivals to a waiting line, known as
the interarrival time, will be exponentially distributed. This information is
relevant to waiting line management.
 The time between two successive breakdowns of a machine will be
exponentially distributed. This information is relevant to maintenance
engineers.
 The life of a product that fails by accident rather than by wear-and-tear
follows an exponential distribution. Electronic components are good
examples. This information is relevant to warranty policies.
The Exponential Distribution

 Pdf of exponential for different parameters (figure not reproduced)
 Memoryless property of exponential:
P(X > s + t | X > s) = P(X > t), for all s, t ≥ 0
 If X represents the life-time of a particle, this particle “dies” at constant rate
λ, independently of its age.
 The geometric distribution is memoryless: The number of attempts
necessary to win the lottery is independent of the past attempts.
 A particular brand of handheld computers fails following an exponential
distribution with a mean of 54.82 months. The company gives a warranty
for 6 months. What percentage of the computers will fail within the
warranty period?
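A sketch of the warranty example, assuming the exponential cdf F(x) = 1 − e^(−λx) with rate λ = 1/mean. The cdf follows from integrating the exponential density, but is not written out on the slides.

```python
from math import exp

mean_life = 54.82   # months
warranty = 6        # months
rate = 1 / mean_life  # lambda, failures per month

# P(X <= 6) = 1 - exp(-lambda * 6)
p_fail_in_warranty = 1 - exp(-rate * warranty)
print(f"Share failing within warranty: {p_fail_in_warranty:.2%}")

# Memoryless property check: P(X > s + t | X > s) equals P(X > t).
s, t = 12, 6
conditional = exp(-rate * (s + t)) / exp(-rate * s)
unconditional = exp(-rate * t)
print(f"P(X > s+t | X > s) = {conditional:.6f}, P(X > t) = {unconditional:.6f}")
```

The values s = 12 and t = 6 in the memoryless check are arbitrary illustration values, not taken from the slides.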
Normal/ Gaussian distribution- Importance
 A lot of random variables occurring in practice/ real life can be
approximated to it.
 If a random variable is affected by many independent causes, and the effect
of each cause is not overwhelmingly large compared to other effects, then
the random variable will closely follow a normal distribution.
 The lengths of pins made by an automatic machine follow a normal distribution. The length of a pin is affected by many independent causes such as vibrations, temperature, wear and tear on the machine, and raw material properties.
 In sampling theory, many of the sample statistics are normally distributed.

 We say that a random variable has a normal distribution with parameters μ and σ² if its density function f is given by
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$, for $-\infty < x < \infty$
 Properties of Normal distribution
 Bell shaped curve
 Pdf is symmetric around μ (Mean = Median), with a (relative) kurtosis of 0
 If $X_1, X_2, \ldots, X_k$ are independent random variables that are normally distributed, then the random variable $Q = a_1X_1 + a_2X_2 + \cdots + a_kX_k + b$ is normally distributed with $E(Q) = a_1E(X_1) + a_2E(X_2) + \cdots + a_kE(X_k) + b$
 and $V(Q) = a_1^2V(X_1) + a_2^2V(X_2) + \cdots + a_k^2V(X_k)$
 A cost accountant needs to forecast the unit cost of a product for next year.
He notes that each unit of the product requires 12 hours of labor and 5.8
pounds of raw material. In addition, each unit of the product is assigned an
overhead cost of $184.50. He estimates that the cost of an hour of labor
next year will be normally distributed with an expected value of $45.75 and
a standard deviation of $1.80; the cost of the raw material will be normally
distributed with an expected value of $62.35 and a standard deviation of
$2.52. Find the distribution of the unit cost of the product. Report its
expected value, variance, and standard deviation.
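A sketch of the cost accountant example. It assumes the labor cost and the raw material cost are independent, which the variance rule requires but the exercise does not state explicitly. The unit cost is Q = 12·L + 5.8·M + 184.50, where L and M are illustrative names for the two random costs.

```python
from math import sqrt

# Coefficients of the linear combination Q = a_L * L + a_M * M + b.
a_labor, a_material, overhead = 12, 5.8, 184.50

mean_labor, sd_labor = 45.75, 1.80        # cost of one hour of labor
mean_material, sd_material = 62.35, 2.52  # cost of one pound of raw material

# E(Q) = a_L E(L) + a_M E(M) + b
expected_cost = a_labor * mean_labor + a_material * mean_material + overhead

# V(Q) = a_L^2 V(L) + a_M^2 V(M), assuming L and M are independent.
variance_cost = a_labor**2 * sd_labor**2 + a_material**2 * sd_material**2
sd_cost = sqrt(variance_cost)

print(f"E(Q) = {expected_cost:.2f}, V(Q) = {variance_cost:.2f}, SD(Q) = {sd_cost:.2f}")
```

Since Q is a linear combination of independent normal variables, it is itself normally distributed with these parameters.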
Standard normal distribution
 If μ = 0 and σ = 1, then the distribution is known as the standard normal distribution (Z ~ N(0, 1²)).
 The cdf of this distribution is often denoted by Φ.
 If X ∼ N(μ, σ²), then (X − μ)/σ ∼ N(0, 1)
 If Z ∼ N(0, 1²), then μ + Zσ ∼ N(μ, σ²)
Standard Normal probabilities
 The total area under the normal curve is equal to one.
 The area between −∞ and 0, and between 0 and ∞, is 0.5 (symmetry)
 Table area = F(z) − 0.5
 The concentration of impurities in a semiconductor used in the production
of microprocessors for computers is a normally distributed random
variable with mean 127 parts per million and standard deviation 22. A
semiconductor is acceptable only if its concentration of impurities is below
150 parts per million. What proportion of the semiconductors are
acceptable for use?
 Fluctuations in the prices of precious metals such as gold have been
empirically shown to be well approximated by a normal distribution when
observed over short intervals of time. In May 1995, the daily price of gold
(1 troy ounce) was believed to have a mean of $383 and a standard
deviation of $12. A broker, working under these assumptions, wanted to
find the probability that the price of gold the next day would be between
$394 and $399 per troy ounce. In this eventuality, the broker had an order
from a client to sell the gold in the client’s portfolio. What is the probability
that the client’s gold will be sold the next day?
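The two exercises above can be checked without a table. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of the standard normal table. The variable names are illustrative.

```python
from statistics import NormalDist

# Semiconductor impurities: X ~ N(127, 22^2), acceptable if X < 150.
impurity = NormalDist(mu=127, sigma=22)
p_acceptable = impurity.cdf(150)

# Gold price: X ~ N(383, 12^2), sold if 394 < X < 399.
gold = NormalDist(mu=383, sigma=12)
p_sold = gold.cdf(399) - gold.cdf(394)

# The same semiconductor answer via standardisation, Z = (X - mu) / sigma.
z = (150 - 127) / 22
p_via_z = NormalDist().cdf(z)

print(f"P(impurity < 150)   = {p_acceptable:.4f} (z = {z:.3f})")
print(f"P(394 < gold < 399) = {p_sold:.4f}")
```

Table-based answers may differ in the third or fourth decimal place because z is rounded before the lookup.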
The Inverse Transformation
 Normal random variables are related to one another by the fact that the
probability that a normal random variable will be above (or below) its mean
a certain number of standard deviations is exactly equal to the probability
that any other normal random variable will be above (or below) its mean
the same number of (its) standard deviations.
 The probability that a normal random variable will be within a distance
of 1 standard deviation from its mean (on either side) is 0.6826
 The probability that a normal random variable will be within 2 standard
deviations of its mean is 0.9544
 The probability that a normal random variable will be within 3 standard
deviations of its mean is 0.9974.
Examples
 PALCO Industries, Inc., is a leading manufacturer of cutting and welding
products. One of the company’s products is an acetylene gas cylinder used
in welding. The amount of nitrogen gas in a cylinder is a normally
distributed random variable with mean 124 units of volume and standard
deviation 12. We want to find the amount of nitrogen x such that 10% of the
cylinders contain more nitrogen than this amount.
Examples
 The amount of fuel consumed by the engines of a jetliner on a flight
between two cities is a normally distributed random variable X with mean μ = 5.7 tons and standard deviation σ = 0.5. Carrying too much fuel is inefficient
as it slows the plane. If, however, too little fuel is loaded on the plane, an
emergency landing may be necessary. The airline would like to determine
the amount of fuel to load so that there will be a 0.99 probability that the
plane will arrive at its destination.
 Weekly sales of Campbell’s soup cans at a grocery store are believed to be
approximately normally distributed with mean 2,450 and standard deviation
400. The store management wants to find two values, symmetrically on
either side of the mean, such that there will be a 0.95 probability that sales
of soup cans during the week will be between the two values. Such
information is useful in determining levels of orders and stock.
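The three inverse-transformation examples above ask for x = μ + zσ at a given probability. This sketch uses `NormalDist.inv_cdf` from the Python standard library instead of a reverse table lookup. The variable names are illustrative.

```python
from statistics import NormalDist

# PALCO: 10% of cylinders contain more nitrogen than x, so P(X <= x) = 0.90.
nitrogen = NormalDist(mu=124, sigma=12)
x_top_10 = nitrogen.inv_cdf(0.90)

# Jetliner: load enough fuel that P(consumption <= load) = 0.99.
fuel = NormalDist(mu=5.7, sigma=0.5)
fuel_load = fuel.inv_cdf(0.99)

# Soup cans: symmetric bounds containing 0.95 probability (0.025 in each tail).
sales = NormalDist(mu=2450, sigma=400)
lower, upper = sales.inv_cdf(0.025), sales.inv_cdf(0.975)

print(f"Nitrogen level exceeded by 10% of cylinders: {x_top_10:.2f}")
print(f"Fuel to load: {fuel_load:.3f} tons")
print(f"95% sales interval: ({lower:.0f}, {upper:.0f})")
```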
 Thank You!
