ASANSOL ENGINEERING COLLEGE
Probability and Distribution
PRESENTED BY:
NAME: 1. ROHAN DWIVEDI 2. SARBASIS MRINAL BANERJEE 3. SMRITI SADHU
ROLL NO: 1. 10830621013 2. 10830621014 3. 10830621015
SUBJECT: PROBABILITY & STATISTICS
PAPER CODE: PCCAIML 501
SEMESTER: 4TH
DEPARTMENT: ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING

DISTRIBUTION
● Frequency Distribution: A listing of the observed (actual) frequencies of all the outcomes that occurred when the experiment was performed.
● Probability Distribution: A listing of the probabilities of all the possible outcomes that could occur if the experiment were performed.
○ It can be described as:
■ A diagram (Probability Tree)
■ A table
■ A mathematical formula

PROBABILITY DISTRIBUTION
● Discrete Distribution: The random variable can take only a limited (countable) number of values. Ex: No. of heads in two tosses.
● Continuous Distribution: The random variable can take any value within a range. Ex: Height of students in the class.

BINOMIAL DISTRIBUTION
● Certain phenomena can be identified as Bernoulli processes, in which:
○ There is a fixed number n of trials
○ Each trial has only two possible outcomes, say success or failure, true or false, etc.
○ The probability of each outcome remains the same over successive trials
○ Trials are statistically independent
● Binomial distribution is a discrete PD which expresses the probability of one set of alternatives: success (p) and failure (q)
○ P(X = r) = nCr p^r q^(n - r) (probability of r successes in n trials)
○ n = no. of trials undertaken
○ r = no. of successes desired
○ p = probability of success
○ q = probability of failure = 1 - p

MEASURES OF CENTRAL TENDENCY AND DISPERSION FOR THE BINOMIAL DISTRIBUTION
● Mean of BD: μ = np
● Standard Deviation of BD: σ = √(npq)
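The binomial formula and its mean and standard deviation can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function names `binomial_pmf` and `binomial_mean_sd` are illustrative, and the example reuses the "number of heads in two tosses" case with an assumed fair coin (p = 0.5).

```python
from math import comb, sqrt


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r successes in n trials: nCr * p^r * q^(n - r)."""
    q = 1.0 - p  # probability of failure
    return comb(n, r) * p**r * q ** (n - r)


def binomial_mean_sd(n: int, p: float) -> tuple[float, float]:
    """Mean np and standard deviation sqrt(npq) of the binomial distribution."""
    q = 1.0 - p
    return n * p, sqrt(n * p * q)


if __name__ == "__main__":
    n, p = 2, 0.5  # assumed example: two tosses of a fair coin
    for r in range(n + 1):
        print(f"P(X = {r}) = {binomial_pmf(r, n, p)}")
    mean, sd = binomial_mean_sd(n, p)
    print(f"mean = {mean}, sd = {sd:.4f}")
```

For two fair tosses this gives probabilities 0.25, 0.5 and 0.25 for 0, 1 and 2 heads, which sum to 1 as any probability distribution must.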
FITTING OF BINOMIAL DISTRIBUTION
Fitting a binomial distribution means obtaining the expected (theoretical) frequency distribution for a given data set under the assumption that the data follow a binomial distribution.
Four coins are tossed 160 times and the following results were obtained:
Fit a binomial distribution under the assumption that the coins are unbiased.
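The observed frequencies for this exercise are not reproduced here, but the expected frequencies under the unbiased assumption depend only on n = 4 coins, p = 0.5 and N = 160 tosses. A minimal Python sketch of that calculation follows; the helper name `expected_binomial_frequencies` is illustrative and not from the slides.

```python
from math import comb


def expected_binomial_frequencies(n: int, p: float, total: int) -> list[float]:
    """Expected frequency of x successes (x = 0..n): total * nCx * p^x * q^(n - x)."""
    q = 1.0 - p
    return [total * comb(n, x) * p**x * q ** (n - x) for x in range(n + 1)]


# Four unbiased coins (p = 0.5) tossed 160 times
freqs = expected_binomial_frequencies(n=4, p=0.5, total=160)
for heads, f in enumerate(freqs):
    print(f"{heads} heads: expected frequency {f}")
```

The expected frequencies for 0 to 4 heads come out as 10, 40, 60, 40 and 10, which add up to the 160 tosses. When p is not given, it is usually estimated from the data as (sample mean) / n before applying the same formula.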
Fit a binomial distribution to the following data
POISSON DISTRIBUTION
When there is a large number of trials but a small probability of success, the binomial calculation becomes impractical.
If λ = mean no. of occurrences of an event per unit interval of time/space, then the probability that it will occur exactly x times is given by
P(x) = f(x) = (e^(-λ) λ^x) / x!, for x = 0, 1, 2, ...
where e is Napier's constant, e ≈ 2.7183
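The Poisson formula can be evaluated directly. This is a minimal Python sketch, not part of the original slides; the function name `poisson_pmf` and the value λ = 2 are assumptions chosen only for illustration.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x occurrences: e^(-lam) * lam^x / x!"""
    return exp(-lam) * lam**x / factorial(x)


if __name__ == "__main__":
    lam = 2.0  # hypothetical mean no. of occurrences per unit interval
    for x in range(6):
        print(f"P({x}) = {poisson_pmf(x, lam):.4f}")
    # The probabilities over all x sum to 1; 50 terms is ample for lam = 2
    print(f"sum = {sum(poisson_pmf(x, lam) for x in range(50)):.6f}")
```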
CHARACTERISTICS OF POISSON DISTRIBUTION
● It is a discrete distribution
● Occurrences are statistically independent
● Mean no. of occurrences in an interval is proportional to the size of the interval (if 5 in one year, then 10 in 2 years, etc.)
● Mean of PD is λ = np
● Standard Deviation of PD is √λ = √(np)
● It is always right skewed
● PD is a good approximation to BD when n ≥ 20 and p ≤ 0.05

NORMAL DISTRIBUTION
● It is a continuous PD, i.e. the random variable can take on any value within a given range. Ex: Height, Weight, Marks etc.
● Associated with the mathematician-astronomer Carl Friedrich Gauss (1777-1855), so also called the Gaussian Distribution.
● It is symmetrical and unimodal (one peak).
● Since it is symmetrical and unimodal, its mean, median and mode all coincide.
● The tails are asymptotic to the horizontal axis, i.e. the curve extends to infinity in both directions without touching the horizontal axis.
● X axis represents the random variable, like height, weight etc.
● Y axis represents its probability density function.
● The area under the curve between two values gives the probability that the random variable falls between them.
● The total area under the curve is 1 (or 100%)

DEFINING A NORMAL DISTRIBUTION
● Only two parameters are needed: Mean & Standard Deviation
○ Same Mean, Different Standard Deviations
○ Same SD, Different Means
○ Different Means & Different Standard Deviations

AREA UNDER THE NORMAL CURVE
● The mean ± 1 standard deviation covers approx. 68% of the area under the curve
● The mean ± 2 standard deviations covers approx. 95.5% of the area under the curve
● The mean ± 3 standard deviations covers approx. 99.7% of the area under the curve

STANDARD NORMAL PD
● In standard Normal PD, Mean = 0, SD = 1
● Z = (x - μ) / σ
○ Z = no. of standard deviations by which x lies from the mean. Also called Z Score
○ x = value of RV
○ μ = mean, σ = standard deviation

THANK YOU
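As a supplementary check of the normal-curve slides, the areas within 1, 2 and 3 standard deviations and the Z score can be computed with the standard library. This is a minimal Python sketch, not part of the original presentation; the function names and the height figures (mean 170, SD 10, x = 185) are hypothetical.

```python
from math import erf, sqrt


def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations by which x lies from the mean."""
    return (x - mu) / sigma


def standard_normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def area_within(k: float) -> float:
    """Area under the normal curve between mean - k*SD and mean + k*SD."""
    return standard_normal_cdf(k) - standard_normal_cdf(-k)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"mean ± {k} SD covers {area_within(k):.4f} of the area")
    # Hypothetical example: heights with mean 170 and SD 10
    z = z_score(185.0, mu=170.0, sigma=10.0)
    print(f"z = {z}, P(X <= 185) = {standard_normal_cdf(z):.4f}")
```

Because any normal variable can be converted to a Z score, one table (or one function) for the standard normal PD is enough to find probabilities for every choice of mean and standard deviation.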