5 Random Variables
• For a discrete random variable, the probabilities sum to one: ∑ᵢ₌₁ᵏ 𝑓(𝑦ᵢ) = 1
• We can present the probability mass function in a graphical
format
Example
• Suppose we conduct an experiment by tossing 2 fair coins. If 𝑌 denotes the
number of heads that appear on top, the random variable 𝑌 can take one
of the values 0, 1, and 2. The probabilities are 𝑃(𝑌 = 0) = 1/4, 𝑃(𝑌 = 1) = 1/2,
and 𝑃(𝑌 = 2) = 1/4.
We can derive the expected value from the distribution of 𝑌 the same way as we derive the mean
or average 𝑦̄ or 𝜇 of a list. For example, the average of the list (1, 0, 8, 6, 6, 1, 6) of 𝑛 = 7 numbers is

(1 + 0 + 8 + 6 + 6 + 1 + 6)/7 = 0 × 1/7 + 1 × 2/7 + 6 × 3/7 + 8 × 1/7 = 4
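The two ways of computing the same average can be checked directly; a minimal sketch in plain Python using the list from the example above:

```python
from collections import Counter

# The list from the example above
ys = [1, 0, 8, 6, 6, 1, 6]
n = len(ys)

mean = sum(ys) / n                                    # plain average
counts = Counter(ys)                                  # value -> frequency
weighted = sum(y * c / n for y, c in counts.items())  # probability-weighted sum
print(mean, weighted)                                 # both are 4.0
```

The weighted form is exactly the expected-value formula, with each distinct value weighted by its relative frequency.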
Example
• If a random variable 𝑌 can take two possible values 𝑎 and 𝑏, with the
probabilities 𝑝 and 1 − 𝑝, then
• 𝐸(𝑌) = 𝑎𝑝 + 𝑏(1 − 𝑝)
• where 0 ≤ 𝑝 ≤ 1. The weighted average of 𝑎 and 𝑏 would be
a number between 𝑎 and 𝑏. The larger 𝑝, the closer 𝐸(𝑌) is to 𝑎;
and the larger 1 − 𝑝, the closer 𝐸(𝑌) is to 𝑏.
Variance
• If you predict the value of a random variable 𝑌 using its expected value
𝐸(𝑌), you will be off by the random amount 𝑌 − 𝐸(𝑌), which is known as
the deviation.
• If you want to consider the size of the deviation, you either need to
consider the absolute value or the square of 𝑌 − 𝐸(𝑌). In algebra, it is easier
to use the squared values than the absolute values, so you consider
𝐸[(𝑌 − 𝐸(𝑌))²], then take the square root to get the value in the same units
as 𝑌.

𝑉𝑎𝑟(𝑌) = 𝐸[(𝑌 − 𝜇)²] = 𝐸(𝑌²) − 𝜇², because 𝐸[(𝑌 − 𝜇)²] = 𝐸(𝑌²) − 2𝜇𝐸(𝑌) + 𝜇² = 𝐸(𝑌²) − 𝜇²
Example
• Let 𝑌 be a discrete random variable with probability function 𝑓(𝑦ᵢ) as
given below

  𝑦ᵢ   𝑓(𝑦ᵢ)
  0    0.20
  1    0.15
  2    0.25
  3    0.35
  4    0.05
• Expected Value
• 𝐸(𝑌) = ∑ 𝑦ᵢ 𝑓(𝑦ᵢ) = 0 × 0.20 + 1 × 0.15 + 2 × 0.25 + 3 × 0.35 + 4 × 0.05 = 1.90
• Variance
• Variance can be calculated in two ways
• 𝜎² = ∑ (𝑦ᵢ − 𝜇)² 𝑓(𝑦ᵢ) = (0 − 1.9)² × 0.20 + (1 − 1.9)² × 0.15 + (2 − 1.9)² × 0.25 + (3 − 1.9)² × 0.35 + (4 − 1.9)² × 0.05 = 1.49
• The second way is
• 𝜎² = 𝐸(𝑌²) − 𝜇² = (0² × 0.20 + 1² × 0.15 + 2² × 0.25 + 3² × 0.35 + 4² × 0.05) − 1.9² = 5.10 − 3.61 = 1.49
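Both ways of computing the variance can be sketched numerically from the table above; a minimal check with numpy:

```python
import numpy as np

# Values and probabilities from the table above
y = np.array([0, 1, 2, 3, 4])
f = np.array([0.20, 0.15, 0.25, 0.35, 0.05])

mu = np.sum(y * f)                  # E(Y) = 1.90
var1 = np.sum((y - mu) ** 2 * f)    # first way:  E[(Y - mu)**2]
var2 = np.sum(y ** 2 * f) - mu**2   # second way: E(Y**2) - mu**2
print(mu, var1, var2)
```

Both ways give the same variance, as the algebraic identity above guarantees.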
Binomial Distribution
• We need to find a formula for
finding the probability of getting 𝑘
successes in 𝑛 independent
trials.
• For this, we consider a tree
diagram for the trials.
• Each path down the steps
represents the possible outcomes
of the first 𝑖 trials.
• The 𝑘th node in the 𝑖th trial
represents 𝑘 successes in 𝑖
trials.
Binomial Distribution
• The expression in each node denotes
the probabilities of success (denoted
by 𝑝) and failure (denoted by
𝑞 = 1 − 𝑝) on each trial.
• The expression shows the sum of
probabilities of all paths leading to the
node.
• For example, in row 3, the probabilities
of 𝑘 successes in 3 trials
can be expressed by

(𝑞 + 𝑝)³ = 𝑞³ + 3𝑝𝑞² + 3𝑝²𝑞 + 𝑝³
Binomial Distribution
• The second term on the right-hand side, 3𝑝𝑞², denotes the probability of 1 success in 3
trials.
• The factor 3 arises because there are three ways to get one success in three trials.
• It also represents the three possible ways to reach the first node in the row.
• From 𝑛 trials, if we want to achieve 𝑘 successes, we should move down-right 𝑘 times
(corresponding to 𝑘 successes) and straight down 𝑛 − 𝑘 times
(corresponding to 𝑛 − 𝑘 failures).
• The probability of every path of 𝑘 successes in the 𝑛 trials
is 𝑝ᵏ𝑞ⁿ⁻ᵏ.
• So, the probability mass function of all the paths of 𝑘 successes in the 𝑛 trials is

𝑃(𝑘) = C(𝑛, 𝑘) 𝑝ᵏ𝑞ⁿ⁻ᵏ
Binomial Distribution
• Where as nk denotes number of paths. It is called . The term
n
k
is given by the formula
2 7
9 1 5 36 × 57
= = 0.279
2 6 6 69
𝑝 k𝑞 n–k = =
2 2 2
n 1
• Probability of 𝑘 heads in 𝑛 fair coin tosses = k × where as 0 ≤ 𝑘
2n
≤ 𝑛
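The formula can be checked by hand and against scipy; a minimal sketch using the nine-trials, 𝑝 = 1/6 dice example:

```python
from math import comb
from scipy.stats import binom

# Nine trials with success probability p = 1/6 (the dice example above)
n, k, p = 9, 2, 1/6

by_hand = comb(n, k) * p**k * (1 - p)**(n - k)   # 36 * 5**7 / 6**9
print(round(by_hand, 3))                         # 0.279
print(round(binom.pmf(k, n, p), 3))              # same value from scipy

# Fair-coin special case: P(k heads in n tosses) = C(n, k) / 2**n
print(comb(4, 2) / 2**4)                         # 0.375
```

`math.comb` computes the binomial coefficient directly, so the hand formula and `binom.pmf` agree to floating-point precision.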
Properties of Binomial Distribution
• There are 𝑛 independent and identical trials of a Bernoulli
(success-failure) experiment.
• There are two possible outcomes of every trial: success or
failure.
• The probability of success (denoted by 𝑝) remains constant for
each trial.
• 𝑋 is a random variable that denotes the number of “successes”
observed during the 𝑛 trials.
Mean and Standard Deviation
• Like any other probability distribution, a binomial probability
distribution also has a mean, 𝜇 = 𝑛𝑝, and standard deviation, 𝜎 = √(𝑛𝑝(1 − 𝑝)).
Mean and Standard Deviation
• Example: A company that produces turf grass monitors the quality of grass by taking a sample
of 25 seeds at regular intervals. The germination rate of the seed is consistent at 85%. Find the
average number of seeds that will germinate in the sample of 25 seeds.
• 𝜇 = 𝑛𝑝 = 25 × 0.85 = 21.25
• 𝜎 = √(25 × 0.85 × 0.15) = 1.785
• Using Python
>>> from scipy.stats import binom
>>> n, p = 25, 0.85
>>> binom.mean(n,p)
21.25
>>> binom.std(n,p)
1.7853571071357126
Binomial Probability Distribution
• We can plot the probability distribution; with 𝑝 = 0.85
the distribution is skewed to the left
>>> import numpy as np
>>> from scipy.stats import binom
>>> import matplotlib.pyplot as plt
>>> x = np.arange(0,26)
>>> n, p = 25, 0.85
>>> fig, ax = plt.subplots(1, 1)
>>> ax.plot(x,binom.pmf(x,n,p),lw=5,alpha=0.5)
>>> ax.vlines(x,0,binom.pmf(x,n,p),lw=5,alpha=0.5)
>>> plt.title('Binomial Distribution n=25, p=0.85')
>>> plt.show()
Binomial Probability Distribution
• The binomial distributions have roughly the same bell
shape irrespective of the values of 𝑛 and 𝑝. As 𝑛 and 𝑝
vary, binomial distributions differ in their mean and
standard deviation.
Distribution of number of success for n trials
• The binomial(100, 𝑝) distribution is shown for 𝑝 = 10% to 90% in steps of 10%. With
an increase in 𝑝 the distribution shifts to the right because the distribution is
centered around the mean, 100𝑝, which increases with 𝑝. The distribution is symmetric
around 𝑝 = 0.5 and skewed near 𝑝 = 0 and 𝑝 = 1. The spread of the distribution
increases with 𝑝 until 50%, where it is maximum, and then starts decreasing. This is justified
by the formula of the standard deviation √(𝑛𝑝(1 − 𝑝)), which increases with the value
of 𝑝 until 50% and then decreases.
import numpy as np
from scipy.stats import binom
import matplotlib.pyplot as plt
fig, ((ax1, ax2, ax3), (ax4, ax5, ax6), (ax7, ax8, ax9)) = plt.subplots(3, 3)
x=np.arange(0,101)
ax1.plot(x,binom.pmf(x,n=100,p=0.1))
ax2.plot(x,binom.pmf(x,n=100,p=0.2))
ax3.plot(x,binom.pmf(x,n=100,p=0.3))
ax4.plot(x,binom.pmf(x,n=100,p=0.4))
ax5.plot(x,binom.pmf(x,n=100,p=0.5))
ax6.plot(x,binom.pmf(x,n=100,p=0.6))
ax7.plot(x,binom.pmf(x,n=100,p=0.7))
ax8.plot(x,binom.pmf(x,n=100,p=0.8))
ax9.plot(x,binom.pmf(x,n=100,p=0.9))
ax1.set_title("binomial(n=100, p=0.1)",fontsize=10)
ax2.set_title("binomial(n=100, p=0.2)",fontsize=10)
ax3.set_title("binomial(n=100, p=0.3)",fontsize=10)
ax4.set_title("binomial(n=100, p=0.4)",fontsize=10)
ax5.set_title("binomial(n=100, p=0.5)",fontsize=10)
ax6.set_title("binomial(n=100, p=0.6)",fontsize=10)
ax7.set_title("binomial(n=100, p=0.7)",fontsize=10)
ax8.set_title("binomial(n=100, p=0.8)",fontsize=10)
ax9.set_title("binomial(n=100, p=0.9)",fontsize=10)
fig.subplots_adjust(hspace=0.7)
fig.subplots_adjust(wspace=0.7)
plt.show()
Distribution of Number of Heads for n coin tosses
• The binomial(𝑛, 0.5) distribution is shown for 𝑛 = 10 to 90 in steps of 10. With an increase in 𝑛 the distribution shifts to the right
because the distribution is centered around the mean, 𝑛/2, which increases with 𝑛. The distribution is symmetric around the expected
value 𝑛/2. The spread of the distribution increases with 𝑛. This is justified by the formula of the standard deviation √(𝑛𝑝(1 − 𝑝)),
which increases with the value of 𝑛. Due to the increase in spread the distribution covers a wider range of values.
import numpy as np
from scipy.stats import binom
import matplotlib.pyplot as plt
fig, ((ax1, ax2, ax3),(ax4, ax5, ax6),
(ax7, ax8, ax9)) = plt.subplots(3,3)
x=np.arange(0,91)
ax1.plot(x,binom.pmf(x,n=10,p=0.5))
ax2.plot(x,binom.pmf(x,n=20,p=0.5))
ax3.plot(x,binom.pmf(x,n=30,p=0.5))
ax4.plot(x,binom.pmf(x,n=40,p=0.5))
ax5.plot(x,binom.pmf(x,n=50,p=0.5))
ax6.plot(x,binom.pmf(x,n=60,p=0.5))
ax7.plot(x,binom.pmf(x,n=70,p=0.5))
ax8.plot(x,binom.pmf(x,n=80,p=0.5))
ax9.plot(x,binom.pmf(x,n=90,p=0.5))
ax1.set_title("binomial(n=10, p=0.5)",fontsize=10)
ax2.set_title("binomial(n=20, p=0.5)",fontsize=10)
ax3.set_title("binomial(n=30, p=0.5)",fontsize=10)
ax4.set_title("binomial(n=40, p=0.5)",fontsize=10)
ax5.set_title("binomial(n=50, p=0.5)",fontsize=10)
ax6.set_title("binomial(n=60, p=0.5)",fontsize=10)
ax7.set_title("binomial(n=70, p=0.5)",fontsize=10)
ax8.set_title("binomial(n=80, p=0.5)",fontsize=10)
ax9.set_title("binomial(n=90, p=0.5)",fontsize=10)
fig.subplots_adjust(hspace=0.7)
fig.subplots_adjust(wspace=0.7)
plt.show()
Example
For the US presidential elections, there are 4 races. In each race, Republicans have 60% chances of winning. If
each race is independent of each other, what is the probability that
• The Republicans will win 0 races, 1 race, 2 races, 3 races or all 4 races
• The Republicans will win at least one race
• The Republicans will win the majority of the races
Let 𝑋 equal the number of races the Republicans win, with 𝑝 = 0.6 and 𝑞 = 0.4.
• 𝑃(0) = C(4, 0) 𝑝⁰𝑞⁴ = (4!/(0!(4 − 0)!)) × 0.6⁰ × 0.4⁴ = 0.4⁴ = 0.0256
• 𝑃(1) = C(4, 1) 𝑝¹𝑞³ = (4!/(1!(4 − 1)!)) × 0.6¹ × 0.4³ = 4 × 0.6 × 0.4³ = 0.1536
• 𝑃(2) = C(4, 2) 𝑝²𝑞² = (4!/(2!(4 − 2)!)) × 0.6² × 0.4² = 6 × 0.6² × 0.4² = 0.3456
• 𝑃(3) = C(4, 3) 𝑝³𝑞¹ = (4!/(3!(4 − 3)!)) × 0.6³ × 0.4¹ = 4 × 0.6³ × 0.4 = 0.3456
• 𝑃(4) = C(4, 4) 𝑝⁴𝑞⁰ = (4!/(4!(4 − 4)!)) × 0.6⁴ × 0.4⁰ = 0.6⁴ = 0.1296
• b. 𝑃(at least 1) = 1 − 𝑃(0) = 0.9744, or 𝑃(1) + 𝑃(2) + 𝑃(3) + 𝑃(4) = 0.9744
• c. 𝑃(Republicans win the majority) = 𝑃(3) + 𝑃(4) = 0.4752
Example – Using Python
>>> from scipy.stats import binom
>>> binom.pmf(k=0, n=4, p=0.6)
0.025600000000000008
>>> binom.pmf(k=1, n=4, p=0.6)
0.15360000000000007
>>> binom.pmf(k=2, n=4, p=0.6)
0.3456000000000001
>>> binom.pmf(k=3, n=4, p=0.6)
0.3456000000000001
>>> binom.pmf(k=4, n=4, p=0.6)
0.1296
>>> 1-binom.cdf(k=0,n=4,p=0.6)
0.9744
>>> 1- binom.cdf(k=2,n=4,p=0.6)
0.47520000000000007
Example
• We use 1 - binom.cdf(k=2,n=4,p=0.6) to find the probability of 3 successes or more,
whereas binom.cdf(k=2,n=4,p=0.6) gives the probability of 2 successes or
less.
>>> binom.cdf(k=2,n=4,p=0.6)
0.5247999999999999
>>> x = np.arange(0,10)
>>> n, p = 100, 0.01
>>> fig, ax = plt.subplots(1, 1)
>>> ax.plot(x,binom.pmf(x,n,p),lw=5,alpha=0.5)
>>> ax.vlines(x,0,binom.pmf(x,n,p),lw=5,alpha=0.5)
>>> plt.title('b(100,1/100)')
>>> plt.show()
Binomial (1000,0.001) distribution
• A box contains 1 red ball and 999 white balls. The distribution of the
number of times the red ball is picked in 1000 random draws with replacement
is as follows
Define – Poisson Distribution
• It can be shown that the binomial distributions are always concentrated around a
small number of values with mean value 𝜇 = 𝑛𝑝. The shape of the distribution
approaches a limit as 𝑛 → ∞ and 𝑝 → 0.
• When the expected value 𝜇 = 𝑛𝑝 is held constant and the
binomial(𝑛, 𝜇/𝑛) approaches the limits 𝑛 → ∞ and 𝑝 → 0, we get the Poisson distribution with parameter 𝜇.
• So, if 𝑛 is large and 𝑝 is small, the distribution of successes of 𝑛 independent trials
depends on 𝜇 = 𝑛𝑝 and the value of 𝑘. The Poisson approximation states

𝑃(𝑘 successes) = (𝜇ᵏ / 𝑘!) 𝑒^(−𝜇)
Example
• A manufacturing process produces 1% defective items in the long run.
What is the probability of getting 2 or more defective items in a sample
of 200 items?
• Mean 𝜇 = 𝑛𝑝 = 200 × 0.01 = 2. We can use the Poisson approximation:

𝑃(2 or more defectives) = 1 − 𝑃(0) − 𝑃(1) = 1 − 𝑒⁻² − 2𝑒⁻² = 1 − 3𝑒⁻² ≈ 0.594
[Figure: binomial(1000, 𝑝) distributions for 𝑝 = 0.5, 0.1, 0.01, 0.001]
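The quality of the Poisson approximation for the defective-items example can be sketched by comparing the exact binomial tail with the Poisson tail using scipy:

```python
from scipy.stats import binom, poisson

n, p = 200, 0.01
mu = n * p                           # 2 expected defectives

exact = 1 - binom.cdf(1, n, p)       # P(2 or more), exact binomial
approx = 1 - poisson.cdf(1, mu)      # Poisson approximation: 1 - 3*e**-2
print(round(exact, 4), round(approx, 4))   # 0.5954 0.5940
```

With 𝑛 = 200 and 𝑝 = 0.01 the approximation is already accurate to about two decimal places.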
Continuous Random Variable
• This section is about continuous random variables.
• For continuous random variables, there is an uncountably infinite
number of real numbers in a given range.
• Because of this, it is impossible to get the probability of a particular
value, as in the case of a discrete random variable.
• The probability of getting a particular value is zero.
• So, in the case of continuous random variables, we use a probability
density function and use calculus to compute the probabilities.
Probability Distribution Function
• In the case of histograms, we studied the concept of calculating the
probability of falling within an interval.
• As the number of intervals goes to infinity, and the width of each interval or
bin goes to zero, the relative frequency histogram becomes almost a
smooth curve.
• This smooth curve is called the probability density function.
• The height of the probability density function does not represent the probability
at that point.
• In fact, at every point the probability is zero. We can find how dense the
probability is at a given point by measuring the height of the curve.
Probability Distribution Function
• The total area under the curve is one. The area under the curve
can be calculated using integration:

∫₋∞^∞ 𝑓(𝑥) 𝑑𝑥 = 1
Probability Distribution Function
• The area of the histogram that lies between the interval (𝑎, 𝑏)
gives the proportion of the
observations that lie in the interval. This
proportion of the area represents the probability that the random
variable falls in the interval (𝑎, 𝑏):

𝑃(𝑎 ≤ 𝑋 ≤ 𝑏) = ∫ₐᵇ 𝑓(𝑥) 𝑑𝑥
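The interval probability can be sketched numerically: integrating a density between 𝑎 and 𝑏 gives the same answer as the difference of the cumulative distribution function (here using the standard normal density as an example):

```python
from scipy.integrate import quad
from scipy.stats import norm

# P(a <= X <= b) is the area under the density between a and b
a, b = -1.0, 1.0
area, _ = quad(norm.pdf, a, b)                 # numerical integration of the pdf
print(round(area, 4))                          # 0.6827
print(round(norm.cdf(b) - norm.cdf(a), 4))     # same answer via the cdf
```

`quad` performs the integral directly; the cdf route is what statistical tables and libraries actually use.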
Probability Distribution Function
• The relative frequency distribution curve can
take several types of shapes.
• If we know the function that represents the
curve, we can find the area under the whole
curve, or between an interval, using
integration.
• Fortunately, the functions for many of these
curves are known and ready to use.
• Example: The probability distribution function of
the scores obtained by students is known.
We can find the probability that a particular
student will score more than a given value by
calculating the shaded
area.
Expected Value and Variance
• For a relative frequency histogram, as the number of bars increases without
bound, the width of each bar gets closer and closer to zero. In the limit, the
midpoint of each bar that contains 𝑥 gets closer and closer to 𝑥. The
height of the bar that contains 𝑥 approaches 𝑓(𝑥). This is
also known as the relative frequency density. In the limit the relative frequency
density gets closer to the probability density. The expected value of the random variable is

𝐸(𝑋) = ∫₋∞^∞ 𝑥 𝑓(𝑥) 𝑑𝑥

• The expected value gives us the variance

𝜎² = 𝐸[(𝑋 − 𝜇)²] = ∫₋∞^∞ (𝑥 − 𝜇)² 𝑓(𝑥) 𝑑𝑥 = 𝐸(𝑋²) − 𝜇²
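The continuous expectation and variance integrals can be evaluated numerically; a minimal sketch for the uniform(0, 1) density, whose answers appear again later in the uniform-distribution section:

```python
from scipy.integrate import quad

# Uniform(0, 1) density: f(x) = 1 on (0, 1), 0 elsewhere
f = lambda x: 1.0

mu, _ = quad(lambda x: x * f(x), 0, 1)       # E(X) = 1/2
ex2, _ = quad(lambda x: x**2 * f(x), 0, 1)   # E(X**2) = 1/3
var = ex2 - mu**2                            # 1/3 - 1/4 = 1/12
print(mu, round(var, 4))                     # 0.5 0.0833
```

The same E(X²) − μ² shortcut used for discrete variables carries over unchanged.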
Discrete vs Continuous Distribution
• Point probability (discrete): 𝑃(𝑋 = 𝑥) = 𝑃(𝑥), the probability that the random
variable 𝑋 has integer value 𝑥.
• Infinitesimal probability (continuous): 𝑃(𝑋 ∈ 𝑑𝑥) = 𝑓(𝑥)𝑑𝑥, the probability per
unit length (density 𝑓(𝑥)) for values near 𝑥.
Discrete vs Continuous Distribution
• Interval probability (discrete): 𝑃(𝑎 ≤ 𝑋 ≤ 𝑏) = ∑_{𝑎≤𝑥≤𝑏} 𝑃(𝑥), the relative area
under the histogram between 𝑎 − 1/2 and 𝑏 + 1/2.
• Interval probability (continuous): 𝑃(𝑎 ≤ 𝑋 ≤ 𝑏) = ∫ₐᵇ 𝑓(𝑥) 𝑑𝑥, the area under the
graph between 𝑎 and 𝑏.
• Constraints (discrete): non-negative with sum 1; 𝑃(𝑥) ≥ 0 for all 𝑥 and ∑_{all 𝑥} 𝑃(𝑥) = 1.
• Constraints (continuous): non-negative with total integral 1; 𝑓(𝑥) ≥ 0 for all 𝑥 and ∫₋∞^∞ 𝑓(𝑥) 𝑑𝑥 = 1.
• Expectation (discrete): 𝐸(𝑋) = ∑_{all 𝑥} 𝑥 𝑃(𝑥).
• Expectation (continuous): 𝐸(𝑋) = ∫₋∞^∞ 𝑥 𝑓(𝑥) 𝑑𝑥.
Uniform Distribution
• The random variable 𝑋 has a uniform distribution if its probability
density function is constant on (𝑎, 𝑏) and zero everywhere else:

𝑓(𝑥) = 1/(𝑏 − 𝑎) for 𝑎 ≤ 𝑥 ≤ 𝑏
𝑓(𝑥) = 0 otherwise

• For an interval (𝑐, 𝑑) inside (𝑎, 𝑏), 𝑃(𝑐 ≤ 𝑋 ≤ 𝑑) = (𝑑 − 𝑐)/(𝑏 − 𝑎), the length of
the interval divided by the length of (𝑎, 𝑏).
• If 𝑋 has a uniform(0, 1) distribution, the probability that 𝑋 equals a given value correct
to two decimal places is 0.01, the length of the corresponding interval.
Expected Value and Variance
Expected Value
The expected value of 𝑋 for the uniform (𝑎, 𝑏) distribution is given by

𝐸(𝑋) = (𝑎 + 𝑏)/2

For the uniform (0, 1) distribution, the expected value is 𝐸(𝑋) = 1/2.

Variance
The variance of 𝑋 can be given by

𝑉𝑎𝑟(𝑋) = (𝑏 − 𝑎)²/12

For the uniform (0, 1) distribution, the variance is 1/12 ≈ 0.0833.
Expected Value and Variance
>>> from scipy.stats import uniform
>>> uniform.mean()
0.5
>>> uniform.var()
0.08333333333333333
Using Python
import numpy as np
from scipy.stats import uniform
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1, 1)
x = np.linspace(0.01,0.99, 100)
ax.plot(x, uniform.pdf(x),'r-', lw=5, alpha=0.6, label='uniform pdf')
r = uniform.rvs(size=1000)
ax.hist(r, density=True, histtype='stepfilled', alpha=0.2)
ax.legend(loc='best', frameon=False)
plt.show()
Normal Distribution
• In a large population, many variables follow a bell-shaped relative
frequency distribution.
• These bell-shaped relative frequency distributions are symmetric, and
they are relatively higher in the middle than at the extremes.
• Examples of such distributions are price fluctuations of a commodity in
the market, scholastic aptitude test scores, physical measurements
(height, weight, length) of an organism, etc.
• Each of these bell-shaped curves can be approximated using a normal
curve.
Normal Distribution
• The normal distribution (also known as the Gaussian distribution) is used
to approximate a large number of probability distributions, so the
normal distribution is the most widely used distribution in
statistics.
• In general, the notation 𝑋 ∼ 𝑁(𝜇, 𝜎²) identifies a normal random variable with
mean 𝜇 and variance 𝜎².
• The normal distribution equation for 𝑋 ∼ 𝑁(𝜇, 𝜎²) is

𝑓(𝑥) = (1/(√(2𝜋)𝜎)) 𝑒^(−(𝑥−𝜇)²/(2𝜎²))

for −∞ < 𝑥 < ∞.
Normal Distribution
The equation consists of two fundamental constants, 𝜋 =
3.14159265358… and the base of the natural logarithm 𝑒 =
2.7182818285…
The equation of the normal curve also consists of two parameters: 𝜇,
the mean, and 𝜎, the standard deviation.
The equation has the term √(2𝜋)𝜎 in the denominator so that the total
area under the curve is 1.
The mean, 𝜇, can be a positive or negative real number.
The 𝜇 signifies the location of the curve.
The standard deviation, 𝜎, can only be a positive
number.
The standard deviation sets the horizontal scale and measures the
spread of the distribution.
The curve is symmetric around 𝜇. From 𝜇 to 𝜇 ± 𝜎, the curve
is concave. Beyond the inflection points 𝜇 ± 𝜎, the curve becomes
convex.
Normal Distribution Equation
• The term 𝑒^(−(1/2)((𝑥−𝜇)/𝜎)²) determines the shape of
the curve.
• Whereas the term 1/(√(2𝜋)𝜎) does not change the
basic shape of the curve.
• It just scales the curve so that the area under it
equates to 1.
• If we denote the term 𝑘 = 1/(√(2𝜋)𝜎), the equation
becomes 𝑦 = 𝑘𝑒^(−(1/2)((𝑥−𝜇)/𝜎)²).
• The curve 𝑦 = 𝑘𝑒^(−(1/2)((𝑥−𝜇)/𝜎)²) for several values
of 𝑘 is as follows
Normal Distribution Equation
• As mentioned before, the shape of the curve depends on the values
of 𝜇 and 𝜎.
• By changing the values of 𝜇 and 𝜎, we can alter the location and the
spread.
• Even by changing the values of 𝜇 and 𝜎, we always get a bell-shaped
curve that is mounded around the mean.
• The peak of the normal distribution lies at the mean
𝜇.
Normal Distribution
• By changing the value of 𝜇 we can slide
the distribution along the 𝑥-axis.
• For example, if we increase the value of 𝜇
by 4, the whole distribution shifts to the
right by 4 points.
• For smaller values of 𝜎, the
curve is thin and tall and the values are
piled up around 𝜇.
• For higher values of 𝜎, the values
are more dispersed around 𝜇.
• For 𝜎² = 0, all the values in the data set
are equal to 𝜇. The
distribution is a constant value with
probability one at 𝜇.
Normal Distribution
>>> import numpy as np
>>> from scipy.stats import norm
>>> import matplotlib.pyplot as plt
>>> x = np.linspace(-7,7,100)
>>> plt.plot(x,norm.pdf(x,loc=0,scale=1),label="mu=0, sigma=1")
>>> plt.plot(x,norm.pdf(x,loc=0,scale=2),label="mu=0, sigma=2")
>>> plt.plot(x,norm.pdf(x,loc=0,scale=2/3),label="mu=0, sigma=2/3")
>>> plt.plot(x,norm.pdf(x,loc=4,scale=2/3),label="mu=4, sigma=2/3")
>>> plt.legend(loc="best")
>>> plt.show()
Standard Normal Distribution
The equation of the normal curve with 𝜇 = 0 and 𝜎² = 1 can be written as

𝜙(𝑧) = (1/√(2𝜋)) 𝑒^(−𝑧²/2)

where 𝑧 = (𝑥 − 𝜇)/𝜎 shows how many standard deviations away the value 𝑥 is from the mean.
If the normal distribution has 𝜇 = 0 and 𝜎 = 1, we get the standard normal distribution.
The area under the standard normal curve to the left of 𝑧 can be defined as

Φ(𝑧) = ∫₋∞ᶻ (1/√(2𝜋)) 𝑒^(−𝑡²/2) 𝑑𝑡, with 𝑃(𝑍 > 𝑧) = 1 − Φ(𝑧)

for −∞ < 𝑧 < ∞.
Standard Normal Distribution
The probability of the interval (𝑎, 𝑏) for the standard normal distribution
can be denoted by

𝑃(𝑎 < 𝑍 < 𝑏) = Φ(𝑏) − Φ(𝑎)

These formulas are used when we work with the normal distribution.
While working with the normal distribution, sketch the standard
normal curve and remember the definition of Φ(𝑧) as the proportion
of area under the curve that is to the left of 𝑧.
The three most common standard normal probabilities are:

The probability within one standard deviation of the mean: Φ(−1, 1) ≈ 68.26%
The probability within two standard deviations of the mean: Φ(−2, 2) ≈ 95.44%
The probability within three standard deviations of the mean: Φ(−3, 3) ≈ 99.74%

Central Limit Theorem
• The appearance of the normal distribution in several contexts can be
explained through the central limit theorem.
• For independent random variables having the same distribution and finite
variance, as the number of samples tends to infinity, the distribution of the
standardized sum (or average) of the variables approaches the
standard normal distribution.
• When we average a large number of independent measurements,
where each individual measurement is small compared to the sum of all the
measurements, the distribution of the sum approaches the normal shape
even if the distribution of the individual measurements does not.
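The three standard normal probabilities can be checked numerically; a minimal sketch using scipy's `norm.cdf` for Φ:

```python
from scipy.stats import norm

# Phi(z) = norm.cdf(z), the area under the standard normal curve left of z
probs = {z: norm.cdf(z) - norm.cdf(-z) for z in (1, 2, 3)}
for z, p in probs.items():
    print(z, round(p, 4))
# 1 -> 0.6827, 2 -> 0.9545, 3 -> 0.9973
```

Table-based figures such as 95.44% and 99.74% differ slightly in the last digit because printed tables round Φ(𝑧) to four decimals before doubling.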
Example
For a normal distribution 𝑋 ∼ 𝑁(20, 2²), find the probability that the measurement will be less
than 23.
First, we calculate the number of standard deviations the value is away from the mean:

𝑧 = (𝑥 − 𝜇)/𝜎 = (23 − 20)/2 = 1.5

So 𝑥 lies 1.5 standard deviations away from the mean. The area corresponding to 𝑧 = 1.5 is
Φ(1.5) = 0.9332, so 𝑃(𝑋 < 23) = 0.9332.
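The same answer falls out of scipy either by standardizing first or by passing the mean and standard deviation directly:

```python
from scipy.stats import norm

# X ~ N(20, 2**2); P(X < 23)
z = (23 - 20) / 2                                 # 1.5 sd above the mean
p = norm.cdf(z)                                   # Phi(1.5)
print(round(p, 4))                                # 0.9332
print(round(norm.cdf(23, loc=20, scale=2), 4))    # same, without standardizing
```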
• Suppose the milk production for a randomly chosen cow follows a normal
distribution with 𝜇 = 70 pounds and 𝜎 = 13 pounds.
• What is the probability that the milk production for a randomly chosen
cow will be less than 60 pounds?
• What is the probability that the milk production for a randomly chosen
cow will be greater than 90 pounds?
• What is the probability that the milk production for a randomly chosen
cow will be between 60 and 90 pounds?
Solution (a)
The 𝑧 score is 𝑧 = (60 − 70)/13 = −0.77
The area to the left is 0.2206. So the probability that a randomly chosen cow
will produce less than 60 pounds is 0.2206.
Solution (b)
The 𝑧 score is 𝑧 = (90 − 70)/13 = 1.54
The area to the left is 0.9382. So the probability that a randomly chosen cow
will produce more than 90 pounds is 1 − 0.9382 = 0.0618.
Solution (c)
To find the probability between 60 and 90, we have to find the area
between 60 and 90: 0.9382 − 0.2206 = 0.7176.
So, we can say that the production for 22.06% of cows is less than 60
pounds, 71.76% is between 60 and 90 pounds, and 6.18% is more than
90 pounds.
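The areas quoted in the solutions correspond to a normal distribution with mean about 70 pounds and standard deviation about 13 pounds; assuming those parameters, a quick scipy check (small differences from the table values come from the tables rounding 𝑧 to two decimals):

```python
from scipy.stats import norm

mu, sigma = 70, 13   # parameters implied by the areas in the solutions (assumed)

p_less60 = norm.cdf(60, mu, sigma)              # about 0.22 (table value 0.2206)
p_more90 = 1 - norm.cdf(90, mu, sigma)          # about 0.06 (table value 0.0618)
p_between = norm.cdf(90, mu, sigma) - p_less60  # about 0.72 (table value 0.7176)
print(p_less60, p_more90, p_between)
```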
Example
The scores for an entrance exam follow the normal distribution with 𝜇 = 500 and 𝜎 = 100.
What proportion of the students taking the exam will score below a given value 𝑥? Also calculate
the lower 10th percentile of all scores.
The 𝑧 value is 𝑧 = (𝑥 − 500)/100.
For the second part, we need to find the 𝑧 value corresponding to probability 10%:

𝑧 = −1.28 = (𝑥 − 500)/100, so 𝑥 = 500 − 1.28 × 100 = 372
The 𝑡 statistic is defined as 𝑡 = (𝑋̄ − 𝜇)/(𝑆/√𝑛).
The degree of freedom
• For a dynamic system, the number of degrees of freedom is the minimum
number of independent coordinates that are needed to completely specify
the position of the system.
• In statistics, the number of degrees of freedom is the number of
measurements that are free to vary in the calculation of a statistic.
• The degrees of freedom can also be defined as the number of independent
measurements in a sample of data that we can use to estimate a
parameter of the population from which the sample is drawn.
• For example, if we have 𝑛 measurements, then for the mean we have 𝑛
independent observations and the degrees of freedom is 𝑛. For the variance,
one degree of freedom is lost in calculating 𝑋̄, so we have 𝑛 − 1
degrees of freedom.
Normal distribution vs t-distribution
• Initially statistics focused on finding probabilities and inference using
large samples and the normal distribution. The standard normal
distribution gives the bell-shaped probability distribution for large
samples, but for small samples the sampling distribution has larger
probability areas in the tails.
• The standard normal distribution and Student's t-distribution are both
symmetrical and have a mean of zero. The standard normal distribution is
bell-shaped and has a standard deviation of one, whereas the t-distribution is
unimodal and has a standard deviation that is not equal to one.
• The standard deviation of the t-distribution varies. For a small sample
size, the t-distribution is more peaked (leptokurtic). Compared to the
standard normal distribution, the t-distribution's probability areas in the tails are
higher. In other words, the probability density of the t-distribution is lower in the
centre and heavier in the tails.
Normal distribution vs t-distribution
Following graph shows the density of t-distribution with an increase in degrees of freedom 𝜈. With the increase
in the value of 𝜈, the t-distribution becomes closer to normal distribution
>>> import numpy as np
>>> from scipy.stats import norm
>>> from scipy.stats import t
>>> import matplotlib.pyplot as plt
>>> x = np.linspace(-5,5,100)
>>> plt.plot(x,norm.pdf(x),label="Normal")
>>> plt.plot(x,t.pdf(x,1), label="T DF =1")
>>> plt.plot(x,t.pdf(x,5), label="T DF =5")
>>> plt.plot(x,t.pdf(x,10), label="T DF =10")
>>> plt.plot(x,t.pdf(x,30), label="T DF =30")
>>> plt.legend(loc="best")
>>> plt.show()
Exponential Distribution
• Let's study a continuous distribution that is closely related to the
Poisson distribution.
• For a Poisson point process, we count the number of occurrences in a given
interval.
• This count is a discrete random variable and follows the Poisson distribution.
• Along with the number of occurrences, the waiting time (or distance) between
successive occurrences is also a random variable.
• For example, the waiting time between the arrival of emails; or the waiting time
between phone calls at a telephone exchange; or the locations of road
accidents on a national highway.
Exponential Distribution
• The distance or span of time between two consecutive points is a
continuous random variable 𝑋.
• The random variable 𝑋 can take any positive value.
• It follows the exponential distribution.
• The exponential distribution has only one parameter, 𝜆.
• The exponential distribution is often used to model the failure of
objects.
• The failures form a Poisson process in time, and the time to the next
failure is exponentially distributed.
Exponential Distribution
Exponential Distribution: The probability density function of a continuous random
variable 𝑋 that follows the exponential distribution is

𝑓(𝑥) = 𝜆𝑒^(−𝜆𝑥) for 𝑥 ≥ 0

where 𝜆 > 0 is a parameter.
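A minimal scipy sketch of the exponential distribution; note that scipy parameterizes by `scale = 1/λ` rather than by the rate 𝜆 itself, and the example rate of 0.5 failures per hour is an assumed value for illustration:

```python
import numpy as np
from scipy.stats import expon

# Exponential with rate lam; scipy parameterizes by scale = 1/lam
lam = 0.5                        # say, 0.5 failures per hour (an assumed rate)
rv = expon(scale=1/lam)

print(rv.mean())                 # mean time to next failure = 1/lam = 2.0
print(round(rv.cdf(2), 4))       # P(X <= 2) = 1 - e**-1 = 0.6321

# Waiting times between Poisson events are exponential: a quick simulation
rng = np.random.default_rng(0)
gaps = rng.exponential(scale=1/lam, size=100_000)
print(round(gaps.mean(), 2))     # close to 2.0
```

The simulated mean of the gaps agrees with 1/𝜆, illustrating the link between the Poisson process and exponential waiting times.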
Beta Distribution
The beta distribution is used for a continuous random variable on (0, 1), with
parameters 𝛼 > 0 and 𝛽 > 0:

𝑓(𝑥) = (Γ(𝛼 + 𝛽)/(Γ(𝛼)Γ(𝛽))) 𝑥^(𝛼−1)(1 − 𝑥)^(𝛽−1)

The shape of the curve is determined by 𝑥^(𝛼−1)(1 − 𝑥)^(𝛽−1).

Mean = 𝛼/(𝛼 + 𝛽)
Variance = 𝛼𝛽/((𝛼 + 𝛽)²(𝛼 + 𝛽 + 1))
Gamma family of distributions
The gamma distribution is used for a non-negative continuous random
variable 𝑋. The two parameters are the shape 𝑎 and the rate 𝑏:

𝑓(𝑥) = (𝑏^𝑎/Γ(𝑎)) 𝑥^(𝑎−1)𝑒^(−𝑏𝑥) for 𝑥 ≥ 0
Gamma family of distributions
Mean and Variance of the gamma distribution:
Mean = 𝑎/𝑏
Variance = 𝑎/𝑏²
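These formulas can be checked against scipy; note scipy's gamma uses the shape `a` and a `scale` parameter equal to 1/𝑏 (the example values 𝑎 = 3, 𝑏 = 2 are arbitrary):

```python
from scipy.stats import gamma

# Gamma with shape a and rate b; scipy uses scale = 1/b
a, b = 3, 2
rv = gamma(a, scale=1/b)

print(rv.mean())   # a/b    = 1.5
print(rv.var())    # a/b**2 = 0.75
```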
Chi-Square distributions
The chi-square (𝜒²) distribution with 𝑟 degrees of freedom is a special
case of the gamma distribution with 𝑎 = 𝑟/2 and 𝑏 = 1/2:

𝑓(𝑥) = (1/(2^(𝑟/2) Γ(𝑟/2))) 𝑥^(𝑟/2 − 1) 𝑒^(−𝑥/2) for 𝑥 ≥ 0
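The special-case relationship can be verified numerically: chi-square with 𝑟 degrees of freedom matches a gamma with shape 𝑟/2 and scale 2 (scipy's scale is 1/𝑏, so 𝑏 = 1/2 becomes scale = 2; the choice 𝑟 = 5 is arbitrary):

```python
import numpy as np
from scipy.stats import chi2, gamma

# chi-square with r degrees of freedom = gamma with a = r/2, b = 1/2 (scale = 2)
r = 5
x = np.linspace(0.1, 10, 50)
same = np.allclose(chi2.pdf(x, df=r), gamma.pdf(x, a=r/2, scale=2))
print(same)                               # True
print(chi2.mean(df=r), chi2.var(df=r))    # mean = r = 5, variance = 2r = 10
```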