Statistics For Economists - Huye
Permutations without Repetition
When $r = n$, the number of different arrangements
becomes
${}_nP_n = n(n-1)\cdots 1 = n!$
Permutations with Repetitions
A set consists of n objects, of which $n_1$ are of one
type (i.e. indistinguishable from each other), $n_2$
are of a second type, …, $n_k$ are of a kth type, so that
$n = n_1 + n_2 + \cdots + n_k$
The number of different permutations of the
objects is
${}_nP_{n_1, n_2, \dots, n_k} = \frac{n!}{n_1!\, n_2! \cdots n_k!}$
Example
The number of different permutations of the 10
letters of the word MISSISSIPI is
${}_{10}P_{1,4,4,1} = \frac{10!}{1!\, 4!\, 4!\, 1!}$
Combinations without Repetition
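The total number of combinations of r objects
selected from n different objects without
considering their order (also called the
combinations of n things taken r at a time)
is given by:
${}_nC_r = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}$
or
${}_nC_r = \binom{n}{r} = \frac{n(n-1)\cdots(n-r+1)}{r!} = \frac{{}_nA_r}{r!}$
where ${}_nA_r = n(n-1)\cdots(n-r+1)$ is the number of arrangements of n objects taken r at a time.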
Example
The number of ways in which 3 cards can be
chosen or selected from a total of 8 different
cards is
${}_8C_3 = \binom{8}{3} = \frac{8!}{3!\,(8-3)!} = 56$
Combinations with Repetition
The number of possibilities to choose r objects
from n different ones, repeating each element
arbitrarily and not considering the order, is
${}_{n+r-1}C_r = \binom{n+r-1}{r} = \frac{(n+r-1)!}{r!\,(n-1)!}$
Example
The number of ways to choose 3 digits from a set
of 5 digits, with repetition allowed and without
considering the order, is
${}_{5+3-1}C_3 = {}_7C_3 = \binom{7}{3} = \frac{7!}{3!\, 4!} = 35$
Binomial Coefficients
$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k$
$= \binom{n}{0} x^n + \binom{n}{1} x^{n-1} y + \binom{n}{2} x^{n-2} y^2 + \cdots + \binom{n}{n} y^n$
Example
$(x + y)^4 = \sum_{k=0}^{4} \binom{4}{k} x^{4-k} y^k = \binom{4}{0} x^4 + \binom{4}{1} x^3 y + \binom{4}{2} x^2 y^2 + \binom{4}{3} x y^3 + \binom{4}{4} y^4$
$= x^4 + 4x^3 y + 6x^2 y^2 + 4x y^3 + y^4$
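The counting formulas above can be checked numerically. The following is a minimal Python sketch using only the standard library; the helper names `multiset_permutations` and `combinations_with_repetition` are illustrative and not part of the course material.

```python
from math import comb, factorial, perm


def multiset_permutations(*counts):
    """n! / (n1! n2! ... nk!) for group sizes n1, ..., nk (n is their sum)."""
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)
    return total


def combinations_with_repetition(n, r):
    """Ways to choose r objects from n kinds, repetition allowed, order ignored."""
    return comb(n + r - 1, r)


permutations_all = perm(5, 5)                    # r = n, so this equals 5!
mississipi = multiset_permutations(1, 4, 4, 1)   # letters M, I, S, P
cards = comb(8, 3)                               # 3 cards out of 8
digits = combinations_with_repetition(5, 3)      # 3 digits out of 5
binomial_row = [comb(4, k) for k in range(5)]    # coefficients of (x + y)^4

print(permutations_all, mississipi, cards, digits, binomial_row)
```

The same functions can be used to check answers to the exercises that follow.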
EXERCISES
1. From a group of 7 men and 6 women, five persons are to
be selected to form a committee so that at least 3 men are
there on the committee. In how many ways can it be done?
2. In how many different ways can the letters of the word
'LEADING' be arranged in such a way that the vowels always
come together?
3. In how many different ways can the letters of the word
'CORPORATION' be arranged so that the vowels always
come together?
PROBABILITY
Sample space: the set of all possible outcomes
of a random experiment.
Event: any subset of the sample space.
Examples
Set operations on events
$A \cup B$ is the event “either A or B or both”. $A \cup B$ is
called the union of A and B.
$A \cap B$ is the event “both A and B”. $A \cap B$ is
called the intersection of A and B.
$\bar{A}$ is the event “not A”. $\bar{A}$ is called the
complement of A.
$A - B = A \cap \bar{B}$ is the event “A but not B”.
Example
Referring to the experiment of tossing a coin
twice, let A be the event “at least one head
occurs” and B the event “the second toss results
in a tail”.
Then $A = \{HT, TH, HH\}$, $B = \{HT, TT\}$, and so we have
$A \cup B = \{HT, TH, HH, TT\}$, $A \cap B = \{HT\}$, $A - B = \{TH, HH\}$,
$\bar{A} = \{TT\}$
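The two-toss example maps directly onto Python's built-in set type. The sketch below is illustrative only; the variable names are assumptions, and each outcome is written as a two-letter string (first toss, second toss).

```python
# Sample space for two coin tosses.
sample_space = {"HH", "HT", "TH", "TT"}

# A: at least one head occurs; B: the second toss results in a tail.
A = {outcome for outcome in sample_space if "H" in outcome}
B = {outcome for outcome in sample_space if outcome[1] == "T"}

union = A | B                      # either A or B or both
intersection = A & B               # both A and B
difference = A - B                 # A but not B
complement_A = sample_space - A    # not A

print(sorted(union), sorted(intersection), sorted(difference), sorted(complement_A))
```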
The Concept of Probability
If an event A can occur in h different ways out of
a total of n possible ways, all of which are equally
likely, then the probability of the event is
$P(A) = \frac{h}{n}$
If after n repetitions of an experiment, where n is
very large, an event A is observed to occur in h of
these, then the probability of the event is
$P(A) = \frac{h}{n}$
Example
If we toss a coin 1000 times and find that it
comes up heads 532 times, we estimate the
probability of a head coming up to be
$\frac{532}{1000} = 0.532$
Axioms of Probability
For every event A in the class C, $P(A) \ge 0$
For the sure or certain event $\Omega$, $P(\Omega) = 1$
For any number of mutually exclusive events $A_1, A_2, \dots$,
$P\left(\bigcup_{i \ge 1} A_i\right) = \sum_{i \ge 1} P(A_i)$
Theorems
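1. If $A_1 \subset A_2$, then
$P(A_1) \le P(A_2)$
and
$P(A_2 - A_1) = P(A_2) - P(A_1)$
2. For every event A, $0 \le P(A) \le 1$
4. If $\bar{A}$ is the complement of A, then $P(\bar{A}) = 1 - P(A)$
5. If $A = A_1 \cup A_2 \cup \cdots \cup A_n$, where $A_1, A_2, \dots, A_n$ are
mutually exclusive events, then
$P(A) = P(A_1) + P(A_2) + \cdots + P(A_n)$
6. If A and B are two events, then
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Example, with $P(A) = 0.20$, $P(B) = 0.30$ and $P(A \cap B) = 0.10$:
$P(\bar{A}) = 1 - P(A) = 0.80$
$P(\bar{B}) = 1 - P(B) = 0.70$
$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.20 + 0.30 - 0.10 = 0.40$
$P(\bar{A} \cap \bar{B}) = P(\overline{A \cup B}) = 1 - P(A \cup B) = 1 - [P(A) + P(B) - P(A \cap B)] = 1 - 0.40 = 0.60$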
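The numerical lines just above use the values 0.20, 0.30 and 0.10 for P(A), P(B) and P(A and B). As a small check of the complement and addition rules, here is a sketch with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Values read from the worked lines above.
p_a = Fraction(20, 100)
p_b = Fraction(30, 100)
p_a_and_b = Fraction(10, 100)

p_not_a = 1 - p_a                    # complement rule
p_not_b = 1 - p_b
p_a_or_b = p_a + p_b - p_a_and_b     # addition rule (Theorem 6)
p_neither = 1 - p_a_or_b             # complement of the union

print(float(p_not_a), float(p_not_b), float(p_a_or_b), float(p_neither))
```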
Conditional Probability
The probability of B given that A has occurred is
$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$
Similarly, the probability of A given that B has
occurred is
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
so that
$P(A \mid B) = \frac{P(A)\, P(B \mid A)}{P(B)}$
Example
Find the probability that a single toss of a die will
result in a number less than 4 if it is given that
the toss resulted in an odd number.
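One way to work this example is to enumerate the six equally likely faces and apply the definition of conditional probability. The sketch below assumes a fair die; the helper `prob` is illustrative.

```python
from fractions import Fraction

faces = range(1, 7)
odd = {f for f in faces if f % 2 == 1}       # conditioning event
less_than_4 = {f for f in faces if f < 4}    # event of interest


def prob(event):
    """Probability of an event under equally likely faces."""
    return Fraction(len(event), 6)


# P(less than 4 | odd) = P(less than 4 and odd) / P(odd)
p_conditional = prob(less_than_4 & odd) / prob(odd)
print(p_conditional)
```

Here the intersection is {1, 3} and the conditioning event is {1, 3, 5}, so the ratio is (2/6)/(3/6) = 2/3.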
Theorems
1. For any three events $A_1, A_2, A_3$ we have
$P(A_1 \cap A_2 \cap A_3) = P(A_1)\, P(A_2 \mid A_1)\, P(A_3 \mid A_1 \cap A_2)$
EXAMPLE
An urn B1 contains 2 white and 3 black balls and another urn
B2 contains 3 white and 4 black balls. One urn is selected at
random and a ball is drawn from it. If the ball drawn is found
to be black, find the probability that the urn chosen was B1.
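The urn example can be worked with the total probability rule followed by Bayes' rule. The sketch below assumes each urn is chosen with probability 1/2 (the reading of "selected at random" used here); the dictionary names are illustrative.

```python
from fractions import Fraction

# Prior probability of choosing each urn (assumed equal).
prior = {"B1": Fraction(1, 2), "B2": Fraction(1, 2)}

# Probability of drawing a black ball from each urn.
p_black_given_urn = {"B1": Fraction(3, 5), "B2": Fraction(4, 7)}

# Total probability of drawing a black ball.
p_black = sum(prior[u] * p_black_given_urn[u] for u in prior)

# Bayes' rule: P(B1 | black).
posterior_b1 = prior["B1"] * p_black_given_urn["B1"] / p_black
print(p_black, posterior_b1)
```

With these inputs, P(black) = 3/10 + 2/7 = 41/70 and P(B1 | black) = 21/41.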
Independent Events
Two events A and B are said to be independent if
and only if
$P(A \cap B) = P(A)\, P(B)$
If A and B are independent, then
$P(A \mid B) = P(A)$ and $P(B \mid A) = P(B)$
Three events A, B and C are independent if and
only if:
1. $P(A \cap B \cap C) = P(A)\, P(B)\, P(C)$
2. $P(A \cap B) = P(A)\, P(B)$
3. $P(A \cap C) = P(A)\, P(C)$
4. $P(B \cap C) = P(B)\, P(C)$
UNIVARIATE RANDOM VARIABLES
A random variable X is a function which assigns
to each sample point $\omega$ a real number $X(\omega)$, called
the value of X at $\omega$.
Example
Suppose that a coin is tossed twice so that the
sample space is $\Omega = \{HH, HT, TH, TT\}$.
Let X represent the number of heads that can
come up.
Sample Point   HH   HT   TH   TT
X              2    1    1    0
The probability distribution function
$P_X(B) = P(X \in B)$
A r.v. is said to be of the discrete type (or just
discrete) if there are countably many points $x_1, x_2, \dots$
such that
1. $P_X(\{x_i\}) > 0$, $i \ge 1$
2. $\sum_i P_X(\{x_i\}) = \sum_i P(X = x_i) = 1$
The probability distribution function (cont.)
The function $f_X$ defined by
$f_X(x) = P_X(\{x_i\}) = P(X = x_i)$ for $x = x_i$
$f_X(x) = 0$ otherwise
has the properties:
$f_X(x) \ge 0 \quad \forall x$
$\sum_i f_X(x_i) = 1$
and
$P(X \in B) = \sum_{x_i \in B} f_X(x_i)$
The function $f_X$ defined on an interval I
satisfies the following properties:
$f_X(x) \ge 0 \quad \forall x \in I$
$P(X \in J) = \int_J f_X(x)\,dx \le 1, \quad J \subseteq I$
$(X = x) = \{\omega : X(\omega) = x\}$
$(X \le x) = \{\omega : X(\omega) \le x\}$
$(x_1 < X \le x_2) = \{\omega : x_1 < X(\omega) \le x_2\}$
P X 2
Distribution Functions
The distribution function (or cumulative
distribution function c.d.f.) of X is the function
defined by
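$F_X(x) = P(X \le x), \quad -\infty < x < \infty$
Properties
$0 \le F_X(x) \le 1 \quad \forall x$
$F_X(x_1) \le F_X(x_2)$ if $x_1 < x_2$
$\lim_{x \to -\infty} F_X(x) = 0$
$\lim_{x \to \infty} F_X(x) = 1$
$\lim_{x \to a^+} F_X(x) = F_X(a^+) = F_X(a)$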
Example
Consider the r.v. X defined in the above Example.
Find and sketch the c.d.f. $F_X(x)$ of X.
x     $(X \le x)$   $F_X(x)$
-1                  0
0     {TTT}         1/8
...
3                   1
4                   1
Determination of Probabilities from the
Distribution Function
$P(a < X \le b) = F_X(b) - F_X(a)$
$P(X > a) = 1 - F_X(a)$
$P(X < b) = F_X(b^-)$
Discrete Random Variables
Probability Mass Functions
The probability mass function or probability
density function (p.d.f.) or probability
distribution is given by
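$p_X(x) = P(X = x) = f(x)$
Properties
$0 \le p_X(x_k) \le 1, \quad k = 1, 2, \dots$
$\sum_k p_X(x_k) = 1$
$F_X(x) = P(X \le x) = \sum_{x_k \le x} p_X(x_k)$
The distribution function is given by
$F_X(x) = \begin{cases} 0, & x < x_1 \\ p_X(x_1), & x_1 \le x < x_2 \\ p_X(x_1) + p_X(x_2), & x_2 \le x < x_3 \\ \vdots \\ p_X(x_1) + p_X(x_2) + \cdots + p_X(x_n), & x_n \le x \end{cases}$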
Example
Suppose that a coin is tossed twice. Let X
represent the number of heads that come up.
What are the probability function and the c.d.f.?
The probability function is
x          0     1     2
$p_X(x)$   1/4   1/2   1/4
The distribution function is
$F_X(x) = \begin{cases} 0, & x < 0 \\ p_X(0) = \frac{1}{4}, & 0 \le x < 1 \\ p_X(0) + p_X(1) = \frac{3}{4}, & 1 \le x < 2 \\ p_X(0) + p_X(1) + p_X(2) = 1, & 2 \le x \end{cases}$
[Figure: sketch of $F_X(x)$]
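The probability function and c.d.f. of this example can be built by enumerating the four equally likely outcomes. This is a minimal sketch assuming a fair coin; `pmf` and `cdf` are illustrative names.

```python
from fractions import Fraction
from itertools import product

# All outcomes of two tosses: HH, HT, TH, TT.
outcomes = ["".join(tosses) for tosses in product("HT", repeat=2)]

# Probability function of X = number of heads.
pmf = {}
for outcome in outcomes:
    x = outcome.count("H")
    pmf[x] = pmf.get(x, 0) + Fraction(1, len(outcomes))


def cdf(x):
    """F_X(x) = P(X <= x): sum of the pmf over all values not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


for x in (-1, 0, 1, 2):
    print(x, cdf(x))
```

Evaluating `cdf` between the jump points (for instance at 1.5) returns the value of the step to its left, which is the step shape that the sketch of the c.d.f. shows.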
Continuous Random Variables
Probability Density Function
The probability density function (p.d.f.) of the
continuous r.v. X is given by
$f_X(x) = \frac{dF_X(x)}{dx}$
Properties
$f_X(x) \ge 0$
$\int_{-\infty}^{\infty} f_X(x)\,dx = 1$
and
$P(a < X \le b) = P(a \le X \le b) = P(a \le X < b) = P(a < X < b) = \int_a^b f_X(x)\,dx = F_X(b) - F_X(a)$
Example
Find the constant c such that the function
$f(x) = \begin{cases} c x^2, & 0 < x < 3 \\ 0, & \text{otherwise} \end{cases}$
is a density function.
Compute $P(1 < X < 2)$
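A possible way to work this example: the density must integrate to 1 over (0, 3), which fixes c, and the requested probability is the integral of the density over (1, 2). The sketch below uses the antiderivative x^3/3 with exact fractions; the helper name `integral_x_squared` is illustrative.

```python
from fractions import Fraction


def integral_x_squared(lower, upper):
    """Exact integral of x^2 over [lower, upper], via the antiderivative x^3 / 3."""
    return (Fraction(upper) ** 3 - Fraction(lower) ** 3) / 3


# Normalisation: c * integral of x^2 over (0, 3) must equal 1.
c = 1 / integral_x_squared(0, 3)

# P(1 < X < 2) = c * integral of x^2 over (1, 2).
p_1_to_2 = c * integral_x_squared(1, 2)

print(c, p_1_to_2)
```

The integral over (0, 3) is 9, so c = 1/9, and the integral over (1, 2) is 7/3, giving a probability of 7/27.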
$E(X^n) = \int_{-\infty}^{\infty} x^n f_X(x)\,dx$ if X is a continuous r.v.
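Moment about the mean
(central moment)
$\mu_n = E\left[(X - \mu_X)^n\right] = \begin{cases} \sum_k (x_k - \mu_X)^n\, p_X(x_k) & \text{if X is a discrete r.v.} \\ \int_{-\infty}^{\infty} (x - \mu_X)^n f_X(x)\,dx & \text{if X is a continuous r.v.} \end{cases}$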
$E(X) = \mu_X = p$
$Var(X) = \sigma_X^2 = p(1 - p)$
Binomial Distribution
A r.v. X is called a binomial r.v. with parameters
(n, p) if its p.m.f. is given by
$p_X(k) = P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}; \quad k = 0, 1, \dots, n$ and $0 \le p \le 1$
The c.d.f. of X is
$F_X(x) = \sum_{k=0}^{x} \binom{n}{k} p^k (1 - p)^{n-k}$
Mean and variance
The mean and variance of the binomial r.v. are
$E(X) = \mu_X = np$
$Var(X) = \sigma_X^2 = np(1 - p)$
Example
A multiple-choice test with 20 questions has five
possible answers for each question. A completely
unprepared student picks the answers for each
question at random and independently. Suppose
X is the number of questions that the student
answers correctly. Calculate the probability that:
The student gets every answer wrong.
The student gets every answer right.
The student gets 8 right answers.
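Under the stated assumptions (20 independent questions, each answered correctly with probability 1/5), X is binomial with n = 20 and p = 0.2. The three probabilities can be sketched as follows; `binomial_pmf` is an illustrative helper name.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial r.v. with parameters (n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 20, 1 / 5   # 20 questions, 5 possible answers each

p_all_wrong = binomial_pmf(0, n, p)    # every answer wrong
p_all_right = binomial_pmf(20, n, p)   # every answer right
p_eight = binomial_pmf(8, n, p)        # exactly 8 right answers

print(p_all_wrong, p_all_right, p_eight)
```

Summing `binomial_pmf(k, n, p)` over k = 0, ..., 20 should give 1 up to floating-point rounding, which is a quick sanity check on the p.m.f.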
Continuous Probability Distributions
Uniform Distribution
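A r.v. X is called a uniform r.v. over (a, b) if its
p.d.f. is given by
$f_X(x) = \begin{cases} \frac{1}{b - a}; & a < x < b \\ 0 & \text{otherwise} \end{cases}$
$E(X) = \mu_X = \frac{a + b}{2}$
$Var(X) = \sigma_X^2 = \frac{(b - a)^2}{12}$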
Example
Let the continuous random variable X denote the
current measured in a thin copper wire in
milliamperes. Assume that the range of X is [0, 20 mA]
and assume that the probability density function
of X is
$f(x) = 0.05, \quad 0 \le x \le 20$
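The p.d.f. of a normal r.v. X is
$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}} = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right]; \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0$
The c.d.f. of X is
$F_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left[-\frac{(\xi - \mu)^2}{2\sigma^2}\right] d\xi = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(x - \mu)/\sigma} \exp\left(-\frac{\xi^2}{2}\right) d\xi$
Mean and variance
The mean and variance of the normal r.v. are
$E(X) = \mu_X = \mu$
$Var(X) = \sigma_X^2 = \sigma^2$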
Normal Distribution
standard normal distribution
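In the case $\mu = 0$ and $\sigma^2 = 1$, the
normal
distribution is called the standard normal
distribution. Its p.d.f. is
$\varphi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}} = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right); \quad -\infty < x < \infty$
Its c.d.f. is
$\Phi(x) = \int_{-\infty}^{x} \varphi(\xi)\,d\xi = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-\frac{\xi^2}{2}\right) d\xi$
The standard normal distribution is
symmetrical about $x = 0$, and hence
$\varphi(-x) = \varphi(x)$
$\Phi(-x) = 1 - \Phi(x)$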
If X has a normal distribution with
mean $\mu$ and variance $\sigma^2$, then
$Z = \frac{X - \mu}{\sigma}$ has a standard normal
distribution. We often write
$X \sim N(\mu, \sigma^2)$
and $Z \sim N(0, 1)$
Standard Normal Distribution
Central Limit Theorem
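Let $X_1, X_2, \dots, X_n$ be a sequence of n independent
and identically distributed random variables, each
with a finite mean $\mu$ and a finite variance $\sigma^2 > 0$.
Let
$Z_n = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} = \frac{S_n - n\mu}{\sigma\sqrt{n}} = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}$
Then
$\lim_{n \to \infty} Z_n \sim N(0, 1)$
or
$\lim_{n \to \infty} P(Z_n \le z) = \lim_{n \to \infty} F_{Z_n}(z) = \Phi(z)$
where
$\Phi(z) = \int_{-\infty}^{z} \varphi(\xi)\,d\xi = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} \exp\left(-\frac{\xi^2}{2}\right) d\xi$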
Examples
The age of the subscribers to a newspaper has a
normal distribution with mean 50 years and
standard deviation 5 years. Compute the
percentage of subscribers who are less than 40
years old and the percentage who are between 40
and 60 years old.
Let Z N 0,1 . Find the values of P Z 1 2 and
P Z 2 9
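For the subscriber-age example (mean 50 years, standard deviation 5 years), the two percentages can be obtained from the normal c.d.f. The sketch below uses `statistics.NormalDist` from the Python standard library in place of a printed table; the variable names are illustrative.

```python
from statistics import NormalDist

age = NormalDist(mu=50, sigma=5)

# P(X < 40): equivalent to Phi((40 - 50) / 5) = Phi(-2).
p_under_40 = age.cdf(40)

# P(40 < X < 60): equivalent to Phi(2) - Phi(-2).
p_40_to_60 = age.cdf(60) - age.cdf(40)

print(f"{100 * p_under_40:.2f}% under 40, {100 * p_40_to_60:.2f}% between 40 and 60")
```

Standardizing by hand with Z = (X - 50)/5 and reading a standard normal table should give the same values up to table rounding.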
SURVEY, SAMPLING AND
ESTIMATION THEORY
Estimation theory consists of estimation of
population parameters, or briefly parameters
(such as population mean and variance), from the
corresponding sample statistics, or briefly statistics
(such as sample mean and variance).
The general procedure for designing a
questionnaire
choosing the broad topics that will reflect the
theme of the survey
deciding on the mode of response
formulating the questions
pilot testing and making final revisions
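$L(\theta \mid x) = L(\theta \mid x_1, \dots, x_n) = \prod_{i=1}^{n} p(x_i; \theta) = p(x_1; \theta) \cdots p(x_n; \theta)$
Or
$L(\theta \mid x) = L(\theta \mid x_1, \dots, x_n) = \prod_{i=1}^{n} f(x_i; \theta) = f(x_1; \theta) \cdots f(x_n; \theta)$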
Maximum likelihood estimator
The maximum likelihood estimator of $\theta$ is
$\hat{\theta}_{MLE} = \arg\max_{\theta} L(\theta \mid x)$
It means
$\frac{d}{d\theta} L(\theta \mid x) = 0$
Log-likelihood function
The Log-likelihood function is given by
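$\ln L(\theta \mid x) = \ln L(\theta \mid x_1, \dots, x_n) = \ln \prod_{i=1}^{n} f(x_i; \theta) = \sum_{i=1}^{n} \ln f(x_i; \theta) = \ln f(x_1; \theta) + \cdots + \ln f(x_n; \theta)$
Or
$\ln L(\theta \mid x) = \ln L(\theta \mid x_1, \dots, x_n) = \ln \prod_{i=1}^{n} p(x_i; \theta) = \sum_{i=1}^{n} \ln p(x_i; \theta) = \ln p(x_1; \theta) + \cdots + \ln p(x_n; \theta)$
So
$\hat{\theta}_{MLE}$ is such that $\frac{d}{d\theta} \ln L(\theta \mid x) = 0$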
Example
Let X be a Bernoulli r.v. The p.m.f. is
$f(x; p) = \begin{cases} p^x (1 - p)^{1 - x} & \text{if } x = 0, 1 \\ 0 & \text{otherwise} \end{cases}$
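To illustrate maximum likelihood for the Bernoulli p.m.f., the sketch below maximizes the log-likelihood over a grid of candidate values of p and compares the result with the sample proportion, which is the well-known closed-form maximizer. The data in `sample` are hypothetical and chosen only for illustration; the grid search is a stand-in for solving the derivative condition analytically.

```python
from math import log

# Hypothetical Bernoulli observations (7 ones out of 10).
sample = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]


def log_likelihood(p, xs):
    """ln L(p | x) = sum of x_i ln p + (1 - x_i) ln(1 - p)."""
    return sum(x * log(p) + (1 - x) * log(1 - p) for x in xs)


# Candidate values 0.001, 0.002, ..., 0.999 (endpoints excluded so log is defined).
grid = [i / 1000 for i in range(1, 1000)]
p_grid = max(grid, key=lambda p: log_likelihood(p, sample))

# Closed form: the sample proportion of ones.
p_closed_form = sum(sample) / len(sample)

print(p_grid, p_closed_form)
```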
$\mu_k = E(X^k), \quad k = 1, 2, \dots$
The corresponding kth sample moment is
$m_k = \frac{1}{n} \sum_{i=1}^{n} X_i^k, \quad k = 1, 2, \dots$
Example
Suppose that $X_1, \dots, X_n$ is a random sample from a
normal distribution with parameters $\mu$ and $\sigma^2$.
For a r.v. X from the normal distribution, the first
and second population moments are respectively
$\mu_1 = E(X) = \mu$ and $\mu_2 = E(X^2) = \mu^2 + \sigma^2$.
The corresponding sample moments are
$m_1 = \frac{1}{n} \sum_{i=1}^{n} X_i$ and $m_2 = \frac{1}{n} \sum_{i=1}^{n} X_i^2$
Interval Estimation
In practice, estimates are often given in the form
of the estimate plus or minus a certain amount
$I = \left[ \bar{X} - t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}},\ \bar{X} + t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}} \right]$
Example
Measurements of the diameters of a random
sample of 200 ball bearings made by a
certain machine during one week showed a
mean of 0.824 inch and a standard deviation
of 0.042 inch. Find 95% confidence limits
for the mean diameter of all the ball
bearings.
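A sketch of the computation for the ball-bearing example follows. Because n = 200 is large, it uses the standard normal quantile as an approximation to the t quantile with 199 degrees of freedom; this substitution is an assumption made here because the Python standard library has no t distribution, and it makes the interval very slightly narrower than the t-based one.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, sample_sd = 200, 0.824, 0.042
confidence = 0.95

# Two-sided critical value z_{alpha/2} (large-sample stand-in for t_{alpha/2, n-1}).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z * sample_sd / sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin

print(f"95% confidence limits: {lower:.4f} to {upper:.4f} inch")
```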
Confidence Interval on the Variance of
a Normal Distribution
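Mean known
$I = \left[ \frac{n\hat{\sigma}^2}{\chi^2_{1-\alpha/2}(n)},\ \frac{n\hat{\sigma}^2}{\chi^2_{\alpha/2}(n)} \right]$
Mean unknown
$I = \left[ \frac{n s^2}{\chi^2_{1-\alpha/2}(n-1)},\ \frac{n s^2}{\chi^2_{\alpha/2}(n-1)} \right]$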
Confidence Interval on the Proportion
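$I = \left[ \hat{p} - z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}},\ \hat{p} + z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \right]$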
Example
A sample poll of 100 voters chosen
at random from all voters in a given
district indicated that 55% of them
were in favor of a particular
candidate. Find 99% confidence
limits for the proportion of all the
voters in favor of this candidate.
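The voter-poll example can be sketched with the normal-approximation interval for a proportion. The critical value is taken from `statistics.NormalDist` rather than a printed table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p_hat = 100, 0.55
confidence = 0.99

# Two-sided critical value z_{alpha/2} for 99% confidence.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"99% confidence limits: {lower:.3f} to {upper:.3f}")
```

Since the interval contains 0.50, a sample of this size does not settle, at the 99% level, whether the candidate has majority support.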
TESTS OF HYPOTHESES FOR A
SIMPLE SAMPLE
Many problems in practice require that we decide
whether to accept or reject a statement about
some parameter.
The statement is called a hypothesis, and the
decision-making procedure about the hypothesis
is called hypothesis testing.
Definitions
Statistical hypothesis: statement about a
population parameter
Null hypothesis and Alternative
hypothesis: two complementary hypotheses in a
hypothesis testing problem are called the null
hypothesis and the alternative hypothesis
Hypothesis testing procedure: a rule that
specifies:
I. For which sample values the decision is made to
accept the null hypothesis as true.
II. For which sample values the null hypothesis is
rejected and the alternative hypothesis is accepted
as true.
Definitions (cont.)
Test Errors
                  Decision
Truth             Fail to reject $H_0$   Reject $H_0$
$H_0$ is true     Correct decision       Type I error
$H_0$ is false    Type II error          Correct decision
Definitions(cont)
The probability of making a Type I error:
H1 : 1
General Procedure for Hypothesis Tests
1. From the problem context, identify the parameter of
interest.
2. State the null hypothesis, $H_0$.
3. Specify an appropriate alternative hypothesis, $H_1$.
4. Choose a significance level, $\alpha$.
5. Determine an appropriate test statistic.
6. State the rejection region for the statistic.
7. Compute any necessary sample quantities, substitute
these into the equation for the test statistic, and compute
that value.
8. Decide whether or not $H_0$ should be rejected and report
that in the problem context.
Tests About the Mean of a Normal
Distribution-Variance $\sigma^2$ known
The sample mean $\bar{X}$ is an unbiased point estimator of
$\mu$ with variance $\sigma^2/n$,
i.e. $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
We wish to test the hypothesis
$H_0: \mu = \mu_0$
$H_1: \mu \ne \mu_0$
Standardize the sample mean and use a test
statistic based on the standard normal
distribution:
$z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
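Tests About the Mean of a Normal
Distribution-Variance $\sigma^2$ known (cont.)
The acceptance region for $H_0$ is given by the
following inequalities:
$-z_{\alpha/2} \le z_0 \le z_{\alpha/2}$
or equivalently, we reject $H_0$ if
$\bar{X} < \mu_0 - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ or $\bar{X} > \mu_0 + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
or if
$\bar{X} \notin \left[ \mu_0 - z_{\alpha/2} \frac{\sigma}{\sqrt{n}},\ \mu_0 + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right]$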
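Test procedure
State $H_0$ and $H_1$
Determine a test statistic and its value $z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
Determine a critical value for $\alpha$
[Figures: rejection regions for $H_1: \mu \ne \mu_0$, $H_1: \mu > \mu_0$ and $H_1: \mu < \mu_0$]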
Example
As a chemist working for a battery manufacturer, you are
given the problem of developing an improved battery for
a calculator that will last “significantly longer” than the
current battery. You know that the measures of the
current battery's life in the calculator are normally
distributed with $\mu = 100.3$ min and $\sigma = 6.25$ min. You
develop an improved battery that theoretically should
last longer, and from preliminary tests you decide that
you can assume its lifetime measures are also normally
distributed with $\sigma = 6.25$ min.
To do a test of $H_0: \mu = 100.3$ min, you take a sample of
n = 15 lifetimes of the improved battery in the calculator
and find that $\bar{x} = 105.6$ min. Does your new battery
prove that it is different from the current one at
$\alpha = 0.01$?
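A sketch of the z test for the battery example follows. It treats the question ("different" from the current battery) as a two-sided alternative, which is an interpretation made here; the critical value comes from `statistics.NormalDist` rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma = 100.3, 6.25    # hypothesized mean and known standard deviation (min)
n, x_bar = 15, 105.6         # sample size and sample mean (min)
alpha = 0.01

# Test statistic z0 = (x_bar - mu_0) / (sigma / sqrt(n)).
z_0 = (x_bar - mu_0) / (sigma / sqrt(n))

# Two-sided critical value z_{alpha/2}.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z_0) > z_critical
print(f"z0 = {z_0:.3f}, critical value = {z_critical:.3f}, reject H0: {reject_h0}")
```

With these inputs z0 is about 3.28, which exceeds the two-sided critical value of about 2.58, so H0 is rejected at the 0.01 level. Under a one-sided (upper) alternative the critical value would be smaller, so the conclusion would be the same.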
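Tests About the Mean of a Normal
Distribution-Variance $\sigma^2$ unknown
The appropriate test statistic is the t statistic
$t_0 = \frac{\bar{X} - \mu_0}{S_{\bar{X}}} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$
The acceptance region is given by
$-t_{\alpha/2,\, n-1} \le t_0 \le t_{\alpha/2,\, n-1}$
We reject $H_0$ if
$\bar{X} < \mu_0 - t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}$ or $\bar{X} > \mu_0 + t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}$
or equivalently, we reject $H_0$ if
$\bar{X} \notin \left[ \mu_0 - t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}},\ \mu_0 + t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}} \right]$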
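Test procedure (t test)
State $H_0$ and $H_1$
Determine a test statistic and its value
$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \sim t(n - 1)$
Determine a critical value for $\alpha$: $t_{\alpha/2,\, n-1}$ for a two-sided
test and $t_{\alpha,\, n-1}$ for a one-sided test
Make a conclusion. Reject $H_0$ if $|t_0| > t_{\alpha/2,\, n-1}$ for a two-sided
test, $t_0 > t_{\alpha,\, n-1}$ for an upper-sided test,
and $t_0 < -t_{\alpha,\, n-1}$ for a lower-sided test
[Figures: rejection regions for $H_1: \mu \ne \mu_0$, $H_1: \mu > \mu_0$ and $H_1: \mu < \mu_0$]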
Example
A sample ($n = 20$, $\bar{x} = 4.0$, $s = 0.83$) is taken from a
normally distributed population that has
unknown $\mu$ and unknown $\sigma$. Test the
hypothesis $H_0: \mu = 3.6$ versus $H_1: \mu \ne 3.6$ at
$\alpha = 0.05$
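A sketch of the t test for this sample follows. The Python standard library has no t distribution, so the critical values for 19 degrees of freedom are entered by hand from a standard t table; these hard-coded constants, and the choice to show both the two-sided and the upper-sided comparison (the direction of the alternative should be checked against the original problem statement), are assumptions of this sketch.

```python
from math import sqrt

n, x_bar, s = 20, 4.0, 0.83
mu_0 = 3.6

# Test statistic t0 = (x_bar - mu_0) / (s / sqrt(n)), with n - 1 = 19 degrees of freedom.
t_0 = (x_bar - mu_0) / (s / sqrt(n))

# Critical values from a t table for 19 degrees of freedom (entered by hand).
T_TWO_SIDED_05 = 2.093   # t_{0.025, 19}
T_ONE_SIDED_05 = 1.729   # t_{0.05, 19}

reject_two_sided = abs(t_0) > T_TWO_SIDED_05
reject_upper_sided = t_0 > T_ONE_SIDED_05

print(f"t0 = {t_0:.3f}")
print(f"two-sided: reject H0 = {reject_two_sided}")
print(f"upper-sided: reject H0 = {reject_upper_sided}")
```

With these inputs t0 is about 2.155, which exceeds both critical values, so H0 is rejected at the 0.05 level under either alternative.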
Tests About the Population Proportion p
$z_0 = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$
The acceptance region is given by
$-z_{\alpha/2} \le z_0 \le z_{\alpha/2}$
The critical region or rejection region is defined
by
$z_0 < -z_{\alpha/2}$ or $z_0 > z_{\alpha/2}$
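Tests About the Proportion (cont.)
We reject $H_0: p = p_0$ if
$\hat{p} < p_0 - z_{\alpha/2} \sqrt{\frac{p_0(1 - p_0)}{n}}$ or $\hat{p} > p_0 + z_{\alpha/2} \sqrt{\frac{p_0(1 - p_0)}{n}}$
or equivalently, if
$\hat{p} \notin \left[ p_0 - z_{\alpha/2} \sqrt{\frac{p_0(1 - p_0)}{n}},\ p_0 + z_{\alpha/2} \sqrt{\frac{p_0(1 - p_0)}{n}} \right]$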
Test procedure (z test)
State $H_0$ and $H_1$
Determine a test statistic and its value
$z_0 = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$
Determine a critical value for $\alpha$:
$z_{\alpha/2}$ for a two-sided test; $z_{\alpha}$ for a one-sided test
Make a conclusion. Reject $H_0$ if
$|z_0| > z_{\alpha/2}$ for a two-sided test, $z_0 > z_{\alpha}$ for an upper-sided
test,
and $z_0 < -z_{\alpha}$ for a lower-sided test
Example
The chairman of a university department takes a
random sample of 75 of the 1727 students, and
asks each of them which of the courses offered by
the department should be retained. He decides
that if significantly fewer than 20% of the
students want a course, then it will be
eliminated. If the result for Mathematics is that
11 of 75 want it retained, then determine its fate
by testing $H_0: p = 0.20$ versus $H_1: p < 0.20$ at
$\alpha = 0.01$
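A sketch of the lower-sided z test for the Mathematics course follows. The alternative p < 0.20 is taken from the "significantly fewer than 20%" wording; the sketch ignores any finite-population correction for sampling 75 of 1727 students, which is a simplifying assumption.

```python
from math import sqrt
from statistics import NormalDist

n, favourable = 75, 11
p_0 = 0.20
alpha = 0.01

p_hat = favourable / n

# Test statistic z0 = (p_hat - p_0) / sqrt(p_0 (1 - p_0) / n).
z_0 = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)

# Lower-sided critical value -z_alpha.
z_critical = NormalDist().inv_cdf(alpha)

reject_h0 = z_0 < z_critical
print(f"p_hat = {p_hat:.4f}, z0 = {z_0:.3f}, critical value = {z_critical:.3f}, reject H0: {reject_h0}")
```

With these inputs z0 is about -1.15, which is not below the critical value of about -2.33, so H0 is not rejected at the 0.01 level and the sample does not justify eliminating the course.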