Probability

The document discusses probability and the normal distribution. It defines key concepts like random variables, probability distributions, and the binomial distribution. It also explains how to calculate probabilities of events occurring together or either occurring using rules like the multiplication rule and addition rule. Examples are provided to illustrate probability calculations.

Uploaded by

Neeraj

PROBABILITY, NORMAL DISTRIBUTION, INDUCTIVE STATISTICS

PROBABILITY
RANDOM CIRCUMSTANCES
A random circumstance is one in which
the outcome is unpredictable.
Example: Disease Status
• You have the disease
• You do not have the disease
• Test for the disease is positive
• Test for the disease is negative
ASSIGNING PROBABILITY
HOW LIKELY IT IS THAT A
PARTICULAR OUTCOME WILL BE THE
RESULT OF A RANDOM
CIRCUMSTANCE

In situations that we can imagine repeating
many times, we define the probability of a specific
outcome as the proportion of times it would occur
over the long run, called the relative frequency
of that particular outcome.
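The long-run relative frequency idea can be illustrated with a small simulation. The following is a minimal Python sketch, assuming a fair six-sided die; the function name `relative_frequency` and the fixed seed are illustrative choices, not part of the original slides.

```python
import random


def relative_frequency(num_rolls, target=2, seed=0):
    """Proportion of simulated fair-die rolls that show `target`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == target)
    return hits / num_rolls


# The proportion settles near 1/6 (about 0.167) as the number of rolls grows.
for n in (100, 10_000, 200_000):
    print(n, relative_frequency(n))
```

With few rolls the proportion can be well off 1/6; it is only over many repetitions that it stabilises.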
RANDOM VARIABLE
• “Numerical outcome of a random circumstance”
• Random Variable: assigns a number to
each outcome of a random circumstance, or,
equivalently, to each unit in a population.
• Probability: what is the chance that a given event will occur?
• Definition: Assigning Probabilities to Simple Events
  - N = number of exhaustive events (all possible outcomes)
  - n = number of favourable events
  - A = event
  - P(A) = probability of the event A
    = number of favourable events for A / number of all possible outcomes
    = n/N
Conditions for Probabilities
for Random Variables

Condition 1
The sum of the probabilities over all possible
values of a discrete random variable must equal
1.
Condition 2
The probability of any specific outcome for a
discrete random variable must be between 0 and
1.
PROBABILITY DEFINITIONS
AND RELATIONSHIPS
Sample space: collection of unique, nonoverlapping
possible outcomes of a random circumstance.
Simple event: one outcome in the sample space; a
possible outcome of a random circumstance.
Event: a collection of one or more simple events in
the sample space; often written as
A, B, C, and so on.
Equally Likely Simple Events
If there are k simple events in the sample space
and they are all equally likely, then the probability
of the occurrence of each one is 1/k.

EXAMPLE: HOW MANY GIRLS ARE LIKELY?
A family has 3 children. The probability of a girl at each birth is 1/2.
What are the probabilities of having 0, 1, 2, or 3 girls?
Sample Space: For each birth, write either B or G.
There are eight possible arrangements of B and G
for three births. These are the simple events.
Sample Space and Probabilities: The eight simple
events are equally likely.
Random Variable X: number of girls in three births.
For each simple event, the value of X is the number
of G’s listed.
EXAMPLE: HOW MANY GIRLS? (CONT)
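Value of X for each simple event:
Simple event   BBB  BBG  BGB  GBB  BGG  GBG  GGB  GGG
X               0    1    1    1    2    2    2    3

Probability distribution function for Number of Girls X:
k           0    1    2    3
P(X = k)   1/8  3/8  3/8  1/8

Graph of the pdf of X: [figure not reproduced; bars of height 1/8, 3/8, 3/8, 1/8 at X = 0, 1, 2, 3]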
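The distribution of the number of girls can be reproduced by listing the sample space. This is a minimal Python sketch, assuming (as the example does) that the eight birth orders are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 = 8 birth orders, e.g. ('B', 'G', 'B'); assumed equally likely.
sample_space = list(product("BG", repeat=3))

# X = number of girls in each simple event.
counts = Counter(outcome.count("G") for outcome in sample_space)

# P(X = k) = (number of simple events with k girls) / 8.
pmf = {k: Fraction(counts[k], len(sample_space)) for k in sorted(counts)}

for k, p in pmf.items():
    print(f"P(X = {k}) = {p}")

# Both conditions for a discrete random variable hold:
# the probabilities sum to 1 and each lies between 0 and 1.
assert sum(pmf.values()) == 1
assert all(0 <= p <= 1 for p in pmf.values())
```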
EXAMPLE:
• Rolling a die.
• The chance of rolling a 2 is 1/6, because there is
a 2 on one face and a total of 6 faces.
• So, assuming the die is balanced, a 2 will come
up about 1 time in 6 over the long run.
Example: Probability of Simple Events
Random Circumstance:
A three-digit winning lottery number is selected.
Sample Space: {000,001,002,003, . . . ,997,998,999}.
There are 1000 simple events.
Probabilities for Simple Event: Probability any specific
three-digit number is a winner is 1/1000.
Assume all three-digit numbers are equally likely.

Event A = last digit is a 9 = {009, 019, . . . , 999}.
Since one out of every ten numbers is in the set, P(A) = 1/10.
Event B = three digits are all the same
= {000, 111, 222, 333, 444, 555, 666, 777, 888, 999}.
Since event B contains 10 simple events, P(B) = 10/1000 = 1/100.
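Counting simple events in this way is easy to check by brute force. The sketch below, in Python, builds the 1000 lottery numbers as strings and counts the members of each event; it assumes all numbers are equally likely, as stated above, and the variable names are illustrative.

```python
from fractions import Fraction

# Sample space: the 1000 strings "000" through "999".
numbers = [f"{i:03d}" for i in range(1000)]

# Event A: last digit is a 9. Event B: all three digits are the same.
event_a = [s for s in numbers if s.endswith("9")]
event_b = [s for s in numbers if len(set(s)) == 1]

# With equally likely simple events, P(event) = favourable / total.
p_a = Fraction(len(event_a), len(numbers))
p_b = Fraction(len(event_b), len(numbers))

print("P(A) =", p_a)
print("P(B) =", p_b)
```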
Mutually Exclusive Events
If the occurrence of one event excludes the occurrence of the
other specified event or events, the events are called
mutually exclusive events.
Notation: $A^C$ represents the complement of A.
Note: $P(A) + P(A^C) = 1$
Example: tossing a coin once
A = getting a head
$A^C$ = not getting a head
P(A) = 1/2, so $P(A^C)$ = 1/2
Independent and Dependent Events
• Two events are independent of each other
if knowing that one will occur (or has
occurred) does not change the probability
that the other occurs.
• Two events are dependent if knowing that
one will occur (or has occurred) changes
the probability that the other occurs.
Probability That Either of Two Events
Happens / “or” rule / addition theorem
Rule 2 (addition rule for “either/or”):
Rule 2a (general):
P(A or B) = P(A) + P(B) – P(A and B)
Rule 2b (for mutually exclusive events):
If A and B are mutually exclusive events,
P(A or B) = P(A) + P(B)
• The probability that either one of 2
mutually exclusive events will occur is the sum of
their separate probabilities.
• For example, the chance of rolling either a
2 or a 3 on a die is 1/6 + 1/6 = 1/3.
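Both forms of the addition rule can be checked on a single die by treating events as sets of faces. This is a minimal Python sketch; the helper `prob` and the second pair of events (`even`, `high`) are illustrative assumptions added to show the overlapping case.

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}  # equally likely faces of a balanced die


def prob(event):
    """P(event) for a set of faces on one roll."""
    return Fraction(len(event & die), len(die))


# Rule 2b: rolling a 2 and rolling a 3 are mutually exclusive.
two, three = {2}, {3}
assert prob(two | three) == prob(two) + prob(three)

# Rule 2a: "even" and "greater than 3" overlap on the faces 4 and 6,
# so the overlap must be subtracted once.
even, high = {2, 4, 6}, {4, 5, 6}
general = prob(even) + prob(high) - prob(even & high)
assert prob(even | high) == general

print(prob(two | three), general)
```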
Probability That Two or More Events
Occur Together / AND rule
Rule 3 (multiplication rule for “and”):
Rule 3a (general), where P(B|A) is the probability of B given A:
P(A and B) = P(A)P(B|A)
Rule 3b (for independent events):
If A and B are independent events,
P(A and B) = P(A)P(B)
Extension of Rule 3b (for more than 2 independent events):
For several independent events,
P(A1 and A2 and … and An) = P(A1)P(A2)…P(An)
THE AND RULE OF PROBABILITY
• The probability of 2 independent events both
happening is the product of their individual
probabilities.
• Called the AND rule because “this event
happens AND that event happens”.
• For example, what is the probability of rolling
a 2 on one die and a 2 on a second die? For
each event, the probability is 1/6, so the
probability of both happening is 1/6 × 1/6 =
1/36.
• Probability of getting twins = 1/80
• Probability of an Rh negative child = 1/10
• So, treating the two events as independent, the probability of getting Rh negative twins
= 1/80 × 1/10
= 1/800
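The two-dice example can be confirmed by listing all 36 ordered pairs and comparing the count with the product rule. This is a small Python sketch; the twins calculation simply repeats the arithmetic above and assumes, as the slide does, that the two events are independent.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) pairs.
rolls = list(product(range(1, 7), repeat=2))

# Direct count of the single pair (2, 2).
p_enumerated = Fraction(sum(1 for r in rolls if r == (2, 2)), len(rolls))

# Rule 3b: multiply the individual probabilities.
p_rule = Fraction(1, 6) * Fraction(1, 6)
assert p_enumerated == p_rule

# Same rule applied to the twins example (independence assumed).
p_rh_negative_twins = Fraction(1, 80) * Fraction(1, 10)

print(p_rule, p_rh_negative_twins)
```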
NOT RULE
• The chance of an event not happening is 1
minus the chance of it happening.
• For example, the chance of not getting a 2 on a
die is 1 - 1/6 = 5/6.
• This rule can be very useful. Sometimes
complicated problems are greatly simplified by
examining them backwards.
• Possible outcomes of a coin flipped 3 times:
  HHH HHT THH HTH HTT THT TTH TTT
• Now P(2H, 1T) = 3/8
• Addition rule: P(HHT or HTH or THH)
  = P(HHT) + P(HTH) + P(THH)
  = 1/8 + 1/8 + 1/8 = 3/8
• Multiplication rule:
  P(H)·P(H)·P(T) + P(H)·P(T)·P(H) + P(T)·P(H)·P(H)
  = 1/2·1/2·1/2 + 1/2·1/2·1/2 + 1/2·1/2·1/2 = 3/8
BINOMIAL RANDOM VARIABLES
Binomial -- results from a binomial experiment.

Conditions for a binomial experiment:


1. There are n “trials” where n is determined in advance
and is not a random value.
2. Two possible outcomes on each trial, called “success”
and “failure” and denoted S and F.
3. Outcomes are independent from one trial to the next.
4. Probability of a “success”, denoted by p, remains the same
from one trial to the next. Probability of “failure” is 1 – p.
BINOMIAL PROBABILITY
DISTRIBUTION
Example: Flip a coin 3 times. The possible outcomes are (call
heads = hits; tails = misses):

Possible outcomes of coin flipped 3 times
Outcome   No. of hits (x)
HHH       3
HHT       2
THH       2
HTH       2
HTT       1
THT       1
TTH       1
TTT       0

Frequency distribution of data
x   f (frequency)
3   1
2   3
1   3
0   1
LARGER FAMILIES: BINOMIAL
DISTRIBUTION
• The binomial distribution is a
shortcut method based on the
expansion of the equation
$(p + q)^n = 1$
where p = probability of
one event (say, a normal child),
q = probability of the
alternative event (mutant child), and
n is the number of children in the
family.
• Since p + q = 1, and 1 raised to any power
(multiplied by itself) is always
equal to 1, this equation
describes the probability of any
size family.
BINOMIAL FOR A FAMILY OF 2
• The expansion of the binomial for n = 2 is
$p^2 + 2pq + q^2$
The 3 terms represent the 3 different kinds of families:
$p^2$ is families with 2 normal children, $2pq$ is the
families with 1 normal and 1 mutant child, and $q^2$ is
the families with 2 mutant children.
• As before, p = 3/4 and q = 1/4.
• Chance of 2 normal children = $p^2 = (3/4)^2$ = 9/16.
• Chance of 1 normal plus 1 mutant = $2pq$ = 2 × 3/4 ×
1/4 = 6/16 = 3/8.
BINOMIAL FOR A FAMILY OF 3
$p^3 + 3p^2q + 3pq^2 + q^3 = 1$
• Here, $p^3$ is a family of 3 normal children, $3p^2q$ is 2 normal plus
1 affected, $3pq^2$ is 1 normal plus 2 affected, and $q^3$ is 3 affected.
• The exponents on the p and q represent the number of children
of each type.
• The coefficients are the number of families of that type.
• Chance of 2 normal + 1 affected is described by the term $3p^2q$.
Thus, $3 \times (3/4)^2 \times 1/4$ = 27/64. Same as we got by enumerating
the families in a list.
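The terms for families of 2 and 3 are instances of the general binomial term, coefficient × p^k × q^(n-k), where the coefficient is the number of ways to choose which k of the n children are normal. A minimal Python sketch follows, using p = 3/4 as in the slides; the function name `binomial_term` is an illustrative choice.

```python
from fractions import Fraction
from math import comb  # available in Python 3.8 and later


def binomial_term(n, k, p):
    """Probability of exactly k 'p-type' outcomes in n independent trials."""
    q = 1 - p
    return comb(n, k) * p**k * q ** (n - k)


p = Fraction(3, 4)  # probability of a normal child, as in the slides

print(binomial_term(2, 2, p))  # p^2 term for a family of 2
print(binomial_term(2, 1, p))  # 2pq term for a family of 2
print(binomial_term(3, 2, p))  # 3p^2q term for a family of 3

# The terms of each expansion add up to (p + q)^n = 1.
for n in (2, 3, 10):
    assert sum(binomial_term(n, k, p) for k in range(n + 1)) == 1
```

Using `Fraction` keeps the results as exact fractions, so they can be compared directly with the hand calculations.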
LARGER FAMILIES
$(p + q)^n = 1$
• Exponents are easy: you just systematically vary the
exponents on p and q so they always add to n. Start
with $p^n q^0$, then $p^{n-1} q^1$, then $p^{n-2} q^2$, ..., $p^0 q^n$.
• Probability from the shape of the normal
distribution or normal curve
The Empirical Rule
Standard Normal Distribution: µ = 0 and σ = 1
99.7% of data are within 3 standard deviations of the mean
95% within 2 standard deviations
68% within 1 standard deviation
[Figure: normal curve with the horizontal axis marked x̄ - 3s, x̄ - 2s, x̄ - s, x̄, x̄ + s, x̄ + 2s, x̄ + 3s. Area of each band, moving outward from the mean on either side: 34%, 13.5%, 2.4%, 0.1%.]
• Probability of calculated values from
tables
• It is determined by referring to the
respective tables.
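The empirical-rule percentages are rounded values of areas under the standard normal curve, which is what printed tables list. As a rough check, the area within k standard deviations of the mean equals erf(k / √2); the Python sketch below uses only the standard library, and the function name `prob_within` is illustrative.

```python
from math import erf, sqrt


def prob_within(k):
    """P(-k <= Z <= k) for a standard normal variable Z."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {100 * prob_within(k):.1f}%")
```

The printed values round to the 68%, 95%, and 99.7% quoted in the rule.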
APPLICATIONS
1. Determine sensitivity & specificity of a
diagnostic test
2. Determine chance of success or failure of a
specific treatment
3. Solve most transmission problems
4. Determine the effect of a certain exposure on
an outcome of disease
5. Study the survival pattern of two or more
groups of patients receiving different
treatments
NORMAL RANDOM VARIABLES
If a population of measurements follows a normal
curve, and if X is the measurement for a randomly
selected individual from that population, then
• X is said to be a normal random variable
• X is also said to have a normal distribution
Any normal random variable can be completely
characterized by its mean, µ, and standard deviation, σ.
NORMAL DISTRIBUTIONS (OF NORMAL
RANDOM VARIABLES)
Just as there are many different uniform distributions
(with different ranges of values), there are also many
different normal distributions, with each one depending
on 2 parameters: the population mean and population SD.
The standard normal distribution is a normal
probability distribution that has a mean = 0
and a SD = 1.0, and the total area under its
density curve = 1.0
NORMAL DISTRIBUTION
• Bell shaped, symmetrical, measures of central
tendency converge
  - Mean, median, mode are equal in a normal
  distribution
  - Mean lies at the peak of the curve
• Many events in nature follow this curve
  - IQ test scores, height, tosses of a fair coin,
  user performance in tests
THE NORMAL CURVE
NB: position of measures of central tendency
50% of scores fall below the mean
[Figure: normal curve with frequency f on the vertical axis; the mean, median, and mode coincide at the peak]
EXAMPLE NORMAL CURVES (HEIGHTS; MALES; FEMALES)
Women: µ = 63.6, σ = 2.5
Men: µ = 69.0, σ = 2.8
[Figure: two normal curves centred at 63.6 and 69.0; horizontal axis: Height (inches)]
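The two height curves can be turned into probabilities with Python's standard-library `statistics.NormalDist`. This is an illustrative sketch only: it takes the means and SDs quoted on the slide at face value, and the two questions asked (a woman taller than 69.0 inches, a man shorter than 63.6 inches) are hypothetical examples, not taken from the slides.

```python
from statistics import NormalDist  # available in Python 3.8 and later

# Parameters as quoted on the slide (heights in inches).
women = NormalDist(mu=63.6, sigma=2.5)
men = NormalDist(mu=69.0, sigma=2.8)

# Proportion of women taller than the men's mean height.
p_woman_above_69 = 1 - women.cdf(69.0)

# Proportion of men shorter than the women's mean height.
p_man_below_63_6 = men.cdf(63.6)

print(f"{p_woman_above_69:.4f} {p_man_below_63_6:.4f}")
```

Both proportions are small because each cut-off lies about two standard deviations from the relevant mean.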
Normal Curve Characteristics
• The curve is bell-shaped and symmetrical.
• The mean, median, and mode are all equal.
• The highest frequency is in the middle of the curve.
• The frequency gradually tapers off as the scores
approach the ends of the curve.
• The curve approaches, but never meets, the abscissa
at both high and low ends.

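POSITIVELY SKEWED DISTRIBUTION
mode < median < mean
[Figure: positively skewed curve; labels from left to right: Mode, Median, Mean]

NEGATIVELY SKEWED DISTRIBUTION
mean < median < mode
[Figure: negatively skewed curve; labels from left to right: Mean, Median, Mode]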
OTHER DISTRIBUTIONS
• Bimodal
  - Data shows 2 peaks
• Multimodal
  - More than 2 peaks
• The shape of the underlying distribution
determines your choice of inferential test
BIMODAL
[Figure: bimodal curve with two peaks; labels from left to right: Mode, Mean and Median, Mode]
DEVIATION UNITS: Z SCORES
Any data point can be expressed in terms of its
distance from the mean in SD units:
$z = \frac{x - \bar{x}}{sd}$
A positive z score implies a value above the mean
A negative z score implies a value below the mean
INTERPRETING Z SCORES
• Mean = 70, SD = 6
• Then a score of 82 is 2 sd [(82 - 70)/6] above the
mean, or 82 = Z score of 2
• Similarly, a score of 64 = a Z score of -1
• By using Z scores, we can standardize a set of
scores to a scale that is more intuitive
• Many IQ tests and aptitude tests do this,
setting a mean of 100 and an SD of 10 etc.
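The worked z scores above can be reproduced in a few lines. This is a minimal Python sketch; the function names `z_score` and `rescale` are illustrative, and the target scale (mean 100, SD 10) is the one mentioned in the last bullet.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, measured in SD units."""
    return (x - mean) / sd


def rescale(z, new_mean, new_sd):
    """Map a z score onto a scale with the given mean and SD."""
    return new_mean + z * new_sd


# Scores from the slide: mean = 70, SD = 6.
z_82 = z_score(82, mean=70, sd=6)  # 2 SD above the mean
z_64 = z_score(64, mean=70, sd=6)  # 1 SD below the mean

# The same two scores on a scale with mean 100 and SD 10.
print(z_82, rescale(z_82, 100, 10))
print(z_64, rescale(z_64, 100, 10))
```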
REMEMBER:
• A Z score reflects position in a normal
distribution
• The Normal Distribution has been plotted out
such that we know what proportion of the
distribution occurs above or below any point
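The proportion of a normal distribution lying below or above a given z score, usually read from a table, can also be computed directly. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper names are illustrative.

```python
from statistics import NormalDist  # available in Python 3.8 and later

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def proportion_below(z):
    """Share of the standard normal distribution lying below z."""
    return standard_normal.cdf(z)


def proportion_above(z):
    """Share of the standard normal distribution lying above z."""
    return 1.0 - standard_normal.cdf(z)


for z in (-1.0, 0.0, 2.0):
    print(f"z = {z:+.1f}: below {proportion_below(z):.4f}, above {proportion_above(z):.4f}")
```

For example, a z score of 0 splits the distribution in half, which matches the statement that 50% of scores fall below the mean.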
IMPORTANCE OF DISTRIBUTION
• Given the mean, the standard deviation, and
some reasonable expectation of normal
distribution, we can establish the confidence
level of our findings
• With a distribution, we can go beyond
descriptive statistics to inferential statistics (tests
of significance)
INFERENCE IS BUILT ON PROBABILITY
• Inferential statistics rely on the laws of
probability to determine the ‘significance’ of the
data we observe.
• Statistical significance is NOT the same as
practical significance
• In statistics, we generally consider ‘significant’
those differences that would occur by chance alone
less than 1 time in 20 (p < 0.05)
• Statistical inference is the process of drawing
conclusions from data that are subject to random
variation, for example, observational errors or sampling
variation.
• Statistical inference, statistical induction and
inferential statistics are terms used to describe systems of
procedures that can be used to draw conclusions from
datasets arising from systems affected by random
variation.
THE WORD STATISTICS
HAS SEVERAL MEANINGS
• Collections of numerical data
  - Example: last year’s enrollment figures
• Summary measures calculated from a collection of data
  - Example: average enrollment per month last year
• Activity of using and interpreting a collection of numerical data
  - Example: evaluators made a projection of next year’s enrollments
DESCRIPTIVE STATISTICS
• Use of numerical information to summarize,
simplify, and present masses of data.
  - Organized and summarized for clearer presentation
  - For ease of communications
• Data may come from studies of populations (often
called a census study) or samples
INFERENTIAL STATISTICS
• To generalize or predict how a large group will
behave based upon information taken from a part
of the group is called INFERENCE
• Techniques which tell us how much confidence
we can have when we GENERALIZE from a sample
to a population
EXAMPLES OF DESCRIPTIVE AND
INFERENTIAL STATISTICS
Descriptive Statistics
• Graphical
  - Arranging data in tables
  - Bar graphs and pie charts
• Numerically
  - Percentages
  - Averages
  - Range
• Relationships
  - Correlation coefficient
  - Regression analysis
Inferential Statistics
• Confidence interval
• Margin of error
• Compare means of two samples
  - Pre/post scores
  - t Test
• Compare means from three samples
  - Pre/post and follow-up
  - ANOVA = analysis of variance
WHAT IS MEANT BY A MEANINGFUL
STATISTIC (SIGNIFICANT)?
• Statistics, descriptive or inferential, are NOT a
substitute for good judgment
• WE (evaluators and policy makers) decide what level
or value of a statistic is meaningful
• State our judgment before gathering and analyzing
data
• Examples:
  - Score on performance test of 80% is passing
  - Pre/post job aid for safety reduces accidents by 50%
INTERPRETATION OF MEANING
• Census Measure (statistic)
  - There is no sampling error
  - The number you have is “real”
  - Judge against pre-set standard
• Inferential Measure (statistic)
  - Tells how sure (confident) you can be that the number you
  have is real
  - Judge against pre-set standard and state how certain the
  measure is
SUMMARY
• Probability = relative frequency = chance of occurrence
of an event or events
  - Value 0 to 1
  - Sum of all probabilities in one random experiment = 1
• Normal curve: mean = mode = median
  - SD = 1 for the standard normal curve
• Inductive / inferential statistics = drawing conclusions


THANK YOU
