
Statistics For Economists - Huye

This document provides an overview of the key topics in statistics for economists, including combinatorial analysis, probability, random variables, probability distributions, sampling theory, estimation theory, and hypothesis testing. It defines fundamental concepts like sample space, events, set operations on events, axioms of probability, and theorems of probability. Examples are provided to illustrate concepts like fundamental counting principle, permutations, combinations, and conditional probability. Formulas for calculating probabilities of events are also presented.

Uploaded by

Christine Kabera

STATISTICS FOR ECONOMISTS

By: Jean Bosco Ndikubwimana


UR 2019/2020_ 0788568333
Course contents
 COMBINATORIAL ANALYSIS
 PROBABILITY
 UNIVARIATE RANDOM VARIABLES
 PROBABILITY DISTRIBUTIONS
 SAMPLING THEORY
 ESTIMATION THEORY
 TESTS OF HYPOTHESES FOR A SIMPLE
SAMPLE
COMBINATORIAL ANALYSIS

 Combinatorics is a branch of mathematics


concerning the study of finite or countable
discrete structures.
COMBINATORIAL ANALYSIS
 Fundamental Principle of Counting
If one thing can be accomplished in $n_1$ different
ways, and after this a second thing can be
accomplished in $n_2$ different ways, ..., and finally a
$k$th thing can be accomplished in $n_k$ different
ways, then all $k$ things can be accomplished in the
specified order in $n_1 n_2 \cdots n_k$ different ways.
• Example: If a man has 2 shirts and 4 ties, then
he has 8 ways of choosing a shirt and then a tie
Arrangements without Repetition
• Suppose we are given $n$ distinct objects and wish to arrange $r$ of
these objects in a line. There are $n$ ways of
choosing the first object, and after this is done,
$n-1$ ways of choosing the second object, ..., and
finally $n-r+1$ ways of choosing the $r$th object.
• The number of different arrangements is therefore
$${}_nA_r = n(n-1)\cdots(n-r+1)$$
• or, equivalently,
$${}_nA_r = \frac{n!}{(n-r)!}$$
Example

 The number of different arrangements consisting


of 3 letters each that can be formed from the 7
letters A, B, C, D, E, F, G is
$${}_7A_3 = \frac{7!}{(7-3)!} = 7 \cdot 6 \cdot 5 = 210$$
Arrangements with Repetition
 An ordering of r objects selected from n
different ones, where any of the elements can be
selected arbitrarily many times, is called an
arrangement with repetition. The number of such arrangements is
$$n^r$$
Permutations without Repetition
 When r=n , the number of different Arrangements
becomes
$${}_nP_n = n(n-1)\cdots 1 = n!$$
Permutations with Repetitions
• A set consists of $n$ objects, of which $n_1$ are of one
type (i.e. indistinguishable from each other), $n_2$
are of a second type, ..., and $n_k$ are of a $k$th type, so that
$$n = n_1 + n_2 + \cdots + n_k$$
• The number of different permutations of the
objects is
$${}_nP_{n_1, n_2, \dots, n_k} = \frac{n!}{n_1!\, n_2! \cdots n_k!}$$
Example
 The number of different permutations of the 10
letters of the word MISSISSIPI (as spelled here: one M, four I's, four S's and one P) is
$${}_{10}P_{1,4,4,1} = \frac{10!}{1!\,4!\,4!\,1!} = 6300$$
Combinations without Repetition
 The total number of combinations of r objects
selected from n different objects without
regard to their order (also called the
combinations of $n$ things taken $r$ at a time)
is given by
$${}_nC_r = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}$$
• or, equivalently,
$${}_nC_r = \binom{n}{r} = \frac{n(n-1)\cdots(n-r+1)}{r!} = \frac{{}_nA_r}{r!}$$
Example
 The number of ways in which 3 cards can be
chosen from a total of 8 different
cards is
$${}_8C_3 = \binom{8}{3} = \frac{8!}{3!\,(8-3)!} = 56$$
Combinations with Repetition
 The number of possibilities to choose r objects
from $n$ different ones, where each element may be repeated
arbitrarily often and order is not considered, is
$${}_{n+r-1}C_r = \binom{n+r-1}{r} = \frac{(n+r-1)!}{r!\,(n-1)!}$$
Example
• The number of ways to choose 3 digits from a set of 5 digits,
with repetition allowed and order ignored, is
$${}_{5+3-1}C_3 = {}_7C_3 = \binom{7}{3} = \frac{7!}{3!\,4!} = 35$$
Binomial Coefficients
$$(x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k = \binom{n}{0}x^n + \binom{n}{1}x^{n-1}y + \binom{n}{2}x^{n-2}y^2 + \cdots + \binom{n}{n}y^n$$
Example

$$(x+y)^4 = \sum_{k=0}^{4} \binom{4}{k} x^{4-k} y^k = \binom{4}{0}x^4 + \binom{4}{1}x^3y + \binom{4}{2}x^2y^2 + \binom{4}{3}xy^3 + \binom{4}{4}y^4$$
$$= x^4 + 4x^3y + 6x^2y^2 + 4xy^3 + y^4$$
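The counting formulas of this chapter can be checked numerically. The following is a minimal Python sketch using the standard `math` module; the function names are illustrative choices, not notation from the course.

```python
from math import comb, factorial, perm


def arrangements(n: int, r: int) -> int:
    """Ordered selections of r out of n distinct objects: n! / (n - r)!."""
    return perm(n, r)


def arrangements_with_repetition(n: int, r: int) -> int:
    """Ordered selections of r out of n objects, repetition allowed: n^r."""
    return n ** r


def permutations_with_repetition(*counts: int) -> int:
    """n! / (n1! n2! ... nk!) where counts are the group sizes n1, ..., nk."""
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)  # each partial quotient is an exact integer
    return total


def combinations_with_repetition(n: int, r: int) -> int:
    """Unordered selections of r out of n objects, repetition allowed."""
    return comb(n + r - 1, r)


print(arrangements(7, 3))                        # 210
print(permutations_with_repetition(1, 4, 4, 1))  # 6300
print(comb(8, 3))                                # 56
print(combinations_with_repetition(5, 3))        # 35
print([comb(4, k) for k in range(5)])            # [1, 4, 6, 4, 1]
```

The last line lists the binomial coefficients of $(x+y)^4$, matching the expansion in the example.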
EXERCISES
 1.From a group of 7 men and 6 women, five persons are to
be selected to form a committee so that at least 3 men are
there on the committee. In how many ways can it be done?
 2. In how many different ways can the letters of the word
'LEADING' be arranged in such a way that the vowels always
come together?
 3. In how many different ways can the letters of the word
'CORPORATION' be arranged so that the vowels always
come together?
PROBABILITY
 Sample space: The set of all possible outcomes
of a random experiment is called the sample
space
 Events: any subset of the sample space

 Examples
Set operations on events
• $A \cup B$ is the event “either A or B or both”. $A \cup B$ is
called the union of A and B.
• $A \cap B$ is the event “both A and B”. $A \cap B$ is
called the intersection of A and B.
• $\bar{A}$ is the event “not A”. $\bar{A}$ is called the
complement of A.
• $A - B = A \cap \bar{B}$ is the event “A but not B”.
Example
 Referring to the experiment of tossing a coin
twice, let A be the event “at least one head
occurs” and B the event “the second toss results
in a tail”.
Then $A = \{HT, TH, HH\}$, $B = \{HT, TT\}$, and so we have
$A \cup B = \{HT, TH, HH, TT\}$, $A \cap B = \{HT\}$, $A - B = \{TH, HH\}$,
$\bar{A} = \{TT\}$
The Concept of Probability
 If an event A can occur in h different ways out of
a total of n possible ways, all of which are equally
likely, then the probability of the event is
$$P(A) = \frac{h}{n}$$
 If after n repetitions of an experiment, where n is
very large, an event A is observed to occur in h of
these, then the probability of the event is
$$P(A) = \frac{h}{n}$$
Example
 If we toss a coin 1000 times and find that it
comes up heads 532 times, we estimate the
probability of a head coming up to be
$$\frac{532}{1000} = 0.532$$
Axioms of Probability
For every event $A$ in the class $C$, $P(A) \ge 0$
For the sure or certain event $\Omega$, $P(\Omega) = 1$
For any number of mutually exclusive events $A_1, A_2, \dots$,
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$$
Theorems
1. If $A_1 \subset A_2$, then
$P(A_1) \le P(A_2)$
and
$P(A_2 - A_1) = P(A_2) - P(A_1)$
2. For every event $A$, $0 \le P(A) \le 1$
3. For the empty set, $P(\emptyset) = 0$
4. If $\bar{A}$ is the complement of $A$, then $P(\bar{A}) = 1 - P(A)$
5. If $A = A_1 \cup A_2 \cup \cdots \cup A_n$, where $A_1, A_2, \dots, A_n$ are
mutually exclusive events, then
$P(A) = P(A_1) + P(A_2) + \cdots + P(A_n)$
6. If $A$ and $B$ are two events, then
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
7. If $A$ and $B$ are two events, then
$P(A) = P(A \cap B) + P(A \cap \bar{B})$
8. If an event $A$ must result in the occurrence of one
of the mutually exclusive events $A_1, A_2, \dots, A_n$,
then
$P(A) = P(A \cap A_1) + P(A \cap A_2) + \cdots + P(A \cap A_n)$
EXAMPLE
• Suppose events A and B are not mutually exclusive and
we know that $P(A) = 0.20$, $P(B) = 0.30$ and $P(A \cap B) = 0.10$.
Find $P(\bar{A})$, $P(\bar{B})$, $P(A \cup B)$ and $P(\bar{A} \cap \bar{B})$.
Example
• With $P(A) = 0.20$, $P(B) = 0.30$ and $P(A \cap B) = 0.10$ as above:
$P(\bar{A}) = 1 - P(A) = 0.80$
$P(\bar{B}) = 1 - P(B) = 0.70$
$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 0.20 + 0.30 - 0.10 = 0.40$
$P(\bar{A} \cap \bar{B}) = P(\overline{A \cup B}) = 1 - P(A \cup B) = 1 - [P(A) + P(B) - P(A \cap B)] = 1 - 0.40 = 0.60$
Conditional Probability
• The probability of B given that A has occurred is
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$$
• Similarly, the probability of A given that B has
occurred is
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
• From these equations we get Bayes' theorem:
$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(B)}$$
Example
 Find the probability that a single toss of a die will
result in a number less than 4 if it is given that
the toss resulted in an odd number.
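One way to work this example is to enumerate the six equally likely faces and apply the definition of conditional probability. The sketch below assumes a fair die; `prob` and `cond_prob` are illustrative helper names.

```python
from fractions import Fraction

omega = set(range(1, 7))  # faces of a fair die, all equally likely


def prob(event: set) -> Fraction:
    """Classical probability: favourable outcomes over possible outcomes."""
    return Fraction(len(event & omega), len(omega))


def cond_prob(b: set, a: set) -> Fraction:
    """P(B | A) = P(A and B) / P(A)."""
    return prob(a & b) / prob(a)


less_than_4 = {1, 2, 3}
odd = {1, 3, 5}

result = cond_prob(less_than_4, odd)
print(result)  # 2/3
```

Knowing that the toss is odd raises the probability of "less than 4" from 1/2 to 2/3.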
Theorems
1. For any three events $A_1, A_2, A_3$ we have
$$P(A_1 \cap A_2 \cap A_3) = P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1 \cap A_2)$$
2. If an event $A$ must result in one of the mutually
exclusive events $A_1, A_2, \dots, A_n$, then
$$P(A) = \sum_{i=1}^{n} P(A_i)\,P(A \mid A_i)$$
Total Probability
 Let B be any event in Ω. Then
$$P(B) = \sum_{i=1}^{n} P(B \cap A_i) = \sum_{i=1}^{n} P(A_i)\,P(B \mid A_i)$$
where the events $A_1, A_2, \dots, A_n$ are mutually exclusive and exhaustive.
This gives the generalized Bayes' theorem: for any event $B$ in Ω with $P(B) > 0$, we obtain
$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{\sum_{j=1}^{n} P(A_j)\,P(B \mid A_j)}$$
EXAMPLE
 An urn B1 contains 2 white and 3 black balls and another urn
B2 contains 3 white and 4 black balls. One urn is selected at
random and a ball is drawn from it. If the ball drawn is found
black, find the probability that the urn chosen was B1
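The urn example is a direct application of total probability followed by Bayes' theorem. A minimal sketch, assuming "selected at random" means each urn is chosen with probability 1/2:

```python
from fractions import Fraction

# Prior probability of choosing each urn (assumed equal)
prior = {"B1": Fraction(1, 2), "B2": Fraction(1, 2)}

# P(black | urn): B1 holds 2 white + 3 black, B2 holds 3 white + 4 black
p_black_given = {"B1": Fraction(3, 5), "B2": Fraction(4, 7)}

# Total probability of drawing a black ball
p_black = sum(prior[u] * p_black_given[u] for u in prior)

# Bayes' theorem: P(urn | black)
posterior = {u: prior[u] * p_black_given[u] / p_black for u in prior}

print(p_black)          # 41/70
print(posterior["B1"])  # 21/41
```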
Independent Events
• Two events A and B are said to be independent if
and only if
$$P(A \cap B) = P(A)\,P(B)$$
• If A and B are independent (with $P(A) > 0$ and $P(B) > 0$), then
$P(A \mid B) = P(A)$ and $P(B \mid A) = P(B)$
Three events A, B, C are independent if and
only if:
1. $P(A \cap B \cap C) = P(A)\,P(B)\,P(C)$
2. $P(A \cap B) = P(A)\,P(B)$
3. $P(A \cap C) = P(A)\,P(C)$
4. $P(B \cap C) = P(B)\,P(C)$
UNIVARIATE RANDOM VARIABLES
• A random variable X is a function which assigns
to each sample point $\omega \in \Omega$ a real number $X(\omega)$, called
the value of X at $\omega$
Example
 Suppose that a coin is tossed twice so that the
sample space is $\Omega = \{HH, HT, TH, TT\}$.
Let X represent the number of heads that can
come up.
Sample point $\omega$    HH   HT   TH   TT
$X(\omega)$              2    1    1    0
The probability distribution function
$$P_X(B) = P(X \in B)$$
• A r.v. is said to be of the discrete type (or just
discrete) if there are countably many points $x_1, x_2, \dots$
such that
1. $P_X(\{x_i\}) > 0$, $i \ge 1$
2. $\sum_i P_X(\{x_i\}) = \sum_i P(X = x_i) = 1$
The probability distribution function (cont)
• The function $f_X$ defined by
$$f_X(x) = \begin{cases} P_X(\{x_i\}) = P(X = x_i) & \text{for } x = x_i \\ 0 & \text{otherwise} \end{cases}$$
has the properties:
$f_X(x) \ge 0$ for all $x$
$\sum_i f_X(x_i) = 1$
and
$$P(X \in B) = \sum_{x_i \in B} f_X(x_i)$$
The function $f_X$ defined on $I \subseteq \mathbb{R}$
satisfies the following properties:
$f_X(x) \ge 0$ for all $x \in I$
$P(X \in J) = \int_J f_X(x)\,dx \le 1$, $J \subseteq I$
The function $f_X$ is called the probability density
function (p.d.f.) of the r.v. X
Events Defined by Random Variables
• For fixed numbers $x$, $x_1$ and $x_2$ we can define the
following events:
$(X = x) = \{\omega : X(\omega) = x\}$
$(X \le x) = \{\omega : X(\omega) \le x\}$
$(x_1 < X \le x_2) = \{\omega : x_1 < X(\omega) \le x_2\}$
• These events have the probabilities:
$P(X = x) = P\{\omega : X(\omega) = x\}$
$P(X \le x) = P\{\omega : X(\omega) \le x\}$
$P(X > x) = P\{\omega : X(\omega) > x\}$
$P(x_1 < X \le x_2) = P\{\omega : x_1 < X(\omega) \le x_2\}$
Example
 In the experiment of tossing a coin three times,
the sample space consists of eight equally likely sample
points:
$\Omega = \{HHH, HHT, HTH, THH, HTT, THT, TTH, TTT\}$
If X is the r.v. giving the number of heads obtained,
find:
$P(X = 2)$
$P(X < 2)$
Distribution Functions
 The distribution function (or cumulative
distribution function c.d.f.) of X is the function
defined by
$$F_X(x) = P(X \le x), \quad -\infty < x < \infty$$
• Properties
$0 \le F_X(x) \le 1$
$F_X(x_1) \le F_X(x_2)$ if $x_1 < x_2$
$\lim_{x \to \infty} F_X(x) = 1$
$\lim_{x \to -\infty} F_X(x) = 0$
$\lim_{x \to a^+} F_X(x) = F_X(a^+) = F_X(a)$
Example
 Consider the r.v. X defined in the above Example
Find and sketch the c.d.f. $F_X(x)$ of X.

x     $(X \le x)$                              $F_X(x)$
-1    ∅                                        0
0     {TTT}                                    1/8
1     {TTT, TTH, THT, HTT}                     4/8 = 1/2
2     {TTT, TTH, THT, HTT, HHT, HTH, THH}      7/8
3     Ω                                        1
4     Ω                                        1
Determination of Probabilities from the Distribution Function
$P(a < X \le b) = F_X(b) - F_X(a)$
$P(X > a) = 1 - F_X(a)$
$P(X < b) = F_X(b^-)$

Discrete Random Variables
Probability Mass Functions
• The probability mass function (p.m.f.) of a discrete r.v. X,
sometimes loosely called its probability density function or probability
distribution, is given by
$$p_X(x) = P(X = x) = f(x)$$
Properties
$0 \le p_X(x_k) \le 1$, $k = 1, 2, \dots$
$\sum_k p_X(x_k) = 1$
$$F_X(x) = P(X \le x) = \sum_{x_k \le x} p_X(x_k)$$
• The distribution function is given by
$$F(x) = \begin{cases} 0 & -\infty < x < x_1 \\ p_X(x_1) & x_1 \le x < x_2 \\ p_X(x_1) + p_X(x_2) & x_2 \le x < x_3 \\ \vdots & \\ p_X(x_1) + p_X(x_2) + \cdots + p_X(x_n) & x_n \le x < \infty \end{cases}$$
Example
 Suppose that a coin is tossed twice. Let X
represent the number of heads that come up.
What are the probability function and the c.d.f.?
The probability function is
x           0     1     2
$p_X(x)$    1/4   1/2   1/4
The distribution function is
$$F_X(x) = \begin{cases} 0 & -\infty < x < 0 \\ p_X(0) = \frac{1}{4} & 0 \le x < 1 \\ p_X(0) + p_X(1) = \frac{3}{4} & 1 \le x < 2 \\ p_X(0) + p_X(1) + p_X(2) = 1 & 2 \le x \end{cases}$$
Its graph is a step function with jumps at $x = 0, 1, 2$.
Continuous Random Variables
Probability Density Function
• The probability density function (p.d.f.) of the
continuous r.v. X is given by
$$f_X(x) = \frac{dF_X(x)}{dx}$$
Properties
$f_X(x) \ge 0$
$\int_{-\infty}^{\infty} f_X(x)\,dx = 1$
• This function is piecewise continuous
$$P(a < X \le b) = \int_a^b f_X(x)\,dx$$
The c.d.f. of a continuous r.v. X can be
obtained by
$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt$$
and
$$P(a < X \le b) = P(a \le X \le b) = P(a \le X < b) = P(a < X < b) = \int_a^b f_X(x)\,dx = F_X(b) - F_X(a)$$
Example
• Find the constant $c$ such that the function
$$f(x) = \begin{cases} c x^2 & 0 < x < 3 \\ 0 & \text{otherwise} \end{cases}$$
is a density function.
• Compute $P(1 < X < 2)$
• Find the distribution function of the random
variable X with density $f(x)$
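A possible way to check this example is to integrate $x^2$ in closed form and use exact fractions. The sketch assumes the density is $c x^2$ on $0 < x < 3$ and that the probability asked for is $P(1 < X < 2)$; `integral_x2` and `cdf` are illustrative names.

```python
from fractions import Fraction


def integral_x2(a, b) -> Fraction:
    """Exact integral of x^2 from a to b, using the antiderivative x^3 / 3."""
    return (Fraction(b) ** 3 - Fraction(a) ** 3) / 3


# The density must integrate to 1 over (0, 3), which fixes c
c = 1 / integral_x2(0, 3)


def cdf(x) -> Fraction:
    """Distribution function F(x) = P(X <= x) for the density c * x^2 on (0, 3)."""
    if x <= 0:
        return Fraction(0)
    if x >= 3:
        return Fraction(1)
    return c * integral_x2(0, x)


p_1_2 = cdf(2) - cdf(1)

print(c)       # 1/9
print(p_1_2)   # 7/27
print(cdf(2))  # 8/27, i.e. F(x) = x^3 / 27 on (0, 3)
```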
Mean and Variance of Random Variables
$$\mu_X = E[X] = \begin{cases} \sum_k x_k\, p_X(x_k) & \text{if } X \text{ is a discrete r.v.} \\ \int_{-\infty}^{\infty} x\, f_X(x)\,dx & \text{if } X \text{ is a continuous r.v.} \end{cases}$$
• Moment about the origin
$$\mu'_n = E[X^n] = \begin{cases} \sum_k x_k^n\, p_X(x_k) & \text{if } X \text{ is a discrete r.v.} \\ \int_{-\infty}^{\infty} x^n\, f_X(x)\,dx & \text{if } X \text{ is a continuous r.v.} \end{cases}$$
Moment about the mean (central moment)
$$\mu_n = E[(X - \mu_X)^n] = \begin{cases} \sum_k (x_k - \mu_X)^n\, p_X(x_k) & \text{if } X \text{ is a discrete r.v.} \\ \int_{-\infty}^{\infty} (x - \mu_X)^n\, f_X(x)\,dx & \text{if } X \text{ is a continuous r.v.} \end{cases}$$
• The variance of X, $\sigma_X^2$ or $\mathrm{Var}(X)$, is the 2nd central
moment of X:
$$\sigma_X^2 = \mathrm{Var}(X) = E[(X - \mu_X)^2]$$
Examples
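1. Calculate the mean and standard deviation of a
r.v. X defined as follows:
$x_i$    -2     -1     0      1      2
$p_i$    1/8    1/4    1/5    1/8    3/10
2. Calculate the mean and standard deviation of the continuous r.v. X with p.d.f.
$$f(x) = \begin{cases} \frac{1}{2}\cos x & \text{if } -\frac{\pi}{2} \le x \le \frac{\pi}{2} \\ 0 & \text{otherwise} \end{cases}$$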
PROBABILITY DISTRIBUTIONS
 Discrete Probability Distributions
Bernoulli Distribution
A r.v. X is called a Bernoulli r.v. with
parameter $p$ if its p.m.f. is given by
$$p_X(k) = P(X = k) = p^k (1-p)^{1-k}, \quad k = 0, 1, \quad 0 \le p \le 1$$
The c.d.f. of the Bernoulli r.v. X is given by
$$F_X(x) = \begin{cases} 0 & x < 0 \\ 1 - p & 0 \le x < 1 \\ 1 & x \ge 1 \end{cases}$$
Mean and variance
• The mean and variance of the Bernoulli r.v. X are
$$E[X] = \mu_X = p$$
$$\mathrm{Var}(X) = \sigma_X^2 = p(1-p)$$
Binomial Distribution
 A r.v. X is called a binomial r.v. with parameters
$(n, p)$ if its p.m.f. is given by
$$p_X(k) = P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \dots, n, \quad 0 \le p \le 1$$
• The c.d.f. of X, for $x = 0, 1, \dots, n$, is
$$F_X(x) = \sum_{k=0}^{x} \binom{n}{k} p^k (1-p)^{n-k}$$
Mean and variance
• The mean and variance of the binomial r.v. are
$$E[X] = \mu_X = np$$
$$\mathrm{Var}(X) = \sigma_X^2 = np(1-p)$$
Example
A multiple-choice test with 20 questions has five
possible answers for each question. A completely
unprepared student picks the answers for each
question at random and independently. Suppose
X is the number of questions that the student
answers correctly. Calculate the probability that:
 The student gets every answer wrong.
 The student gets every answer right.
 The student gets 8 right answers.
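Under the stated assumptions (20 independent questions, each answered correctly with probability 1/5), X is binomial with $n = 20$ and $p = 0.2$. A minimal sketch of the three probabilities, with `binom_pmf` as an illustrative helper name:

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for a binomial r.v. with parameters (n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 20, 1 / 5  # 20 questions, 5 possible answers each

p_all_wrong = binom_pmf(0, n, p)   # about 0.0115
p_all_right = binom_pmf(20, n, p)  # about 1.05e-14
p_eight = binom_pmf(8, n, p)       # about 0.0222

print(f"{p_all_wrong:.4f}  {p_all_right:.3e}  {p_eight:.4f}")
```

Guessing every answer correctly is practically impossible, while getting exactly 8 right happens in roughly 2% of attempts.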
Continuous Probability Distributions
 Uniform Distribution
A r.v. X is called a uniform r.v. over $(a, b)$ if its
p.d.f. is given by
$$f_X(x) = \begin{cases} \dfrac{1}{b-a} & a < x < b \\ 0 & \text{otherwise} \end{cases}$$
The c.d.f. of X is
$$F_X(x) = \begin{cases} 0 & x \le a \\ \dfrac{x-a}{b-a} & a < x < b \\ 1 & x \ge b \end{cases}$$
Mean and variance
• The mean and variance of the uniform r.v. are
$$E[X] = \mu_X = \frac{a+b}{2}$$
$$\mathrm{Var}(X) = \sigma_X^2 = \frac{(b-a)^2}{12}$$
Example
Let the continuous random variable X denote the
current measured in a thin copper wire in
milliamperes. Assume that the range of X is $[0, 20]$ mA
and assume that the probability density function
of X is
$$f(x) = 0.05, \quad 0 \le x \le 20$$
• What is the probability that a measurement of
current is between 5 and 10 mA?
• What are the mean and standard deviation of X?
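Since the density is constant at 0.05 on $[0, 20]$, X is uniform with $a = 0$ and $b = 20$, and the formulas above apply directly. A short sketch (the variable and function names are illustrative):

```python
from math import sqrt

a, b = 0.0, 20.0       # range of the current in mA
density = 1 / (b - a)  # 0.05, matching f(x) in the example


def uniform_cdf(x: float) -> float:
    """F(x) for a uniform r.v. on (a, b), clipped to [0, 1]."""
    return min(max((x - a) / (b - a), 0.0), 1.0)


p_5_to_10 = uniform_cdf(10) - uniform_cdf(5)  # P(5 < X < 10)
mean = (a + b) / 2
std = sqrt((b - a) ** 2 / 12)

print(p_5_to_10)      # 0.25
print(mean)           # 10.0
print(round(std, 2))  # 5.77
```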
Normal Distribution
 A r.v. X is called a normal (or Gaussian) r.v. if its
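p.d.f. is given by
$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right], \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0$$
• The c.d.f. of X is
$$F_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left[-\frac{1}{2}\left(\frac{\xi-\mu}{\sigma}\right)^2\right] d\xi = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(x-\mu)/\sigma} \exp\left(-\frac{\xi^2}{2}\right) d\xi$$
Mean and variance
• The mean and variance of the normal r.v. are
$$E[X] = \mu_X = \mu$$
$$\mathrm{Var}(X) = \sigma_X^2 = \sigma^2$$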
Normal Distribution
Standard normal distribution
• In the case $\mu = 0$ and $\sigma^2 = 1$, the normal
distribution is called the standard normal
distribution. Its p.d.f. is
$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right), \quad -\infty < x < \infty$$
• Its c.d.f. is
$$\Phi(x) = \int_{-\infty}^{x} \phi(\xi)\,d\xi = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-\frac{\xi^2}{2}\right) d\xi$$
The standard normal distribution is
symmetrical about $x = 0$, and hence
$\phi(-x) = \phi(x)$
$\Phi(-x) = 1 - \Phi(x)$
If X has a normal distribution with
mean $\mu$ and variance $\sigma^2$, then
$$Z = \frac{X - \mu}{\sigma}$$
has a standard normal
distribution. We often write
$X \sim N(\mu, \sigma^2)$ and $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$
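Central Limit Theorem
• Let $X_1, X_2, \dots, X_n$ be a sequence of $n$ independent
and identically distributed random variables, each
with a finite mean $\mu$ and a finite variance $\sigma^2 > 0$.
Let
$$Z_n = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} = \frac{S_n - n\mu}{\sigma\sqrt{n}} = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}}$$
where $S_n = \sum_{i=1}^{n} X_i$ and $\bar{X}_n = S_n / n$.
Then the distribution of $Z_n$ tends to $N(0, 1)$ as $n \to \infty$, i.e.
$$\lim_{n \to \infty} P(Z_n \le z) = \lim_{n \to \infty} F_{Z_n}(z) = \Phi(z)$$
where
$$\Phi(z) = \int_{-\infty}^{z} \phi(\xi)\,d\xi = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} \exp\left(-\frac{\xi^2}{2}\right) d\xi$$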
Examples
 The age of the subscribers to a newspaper has a
normal distribution with mean 50 years and
standard deviation 5 years. Compute the
percentage of subscribers who are less than 40
years old and the percentage who are between 40
and 60 years old.
 Let Z N  0,1 . Find the values of P  Z  1  2  and
P  Z 2  9
SURVEY, SAMPLING AND
ESTIMATION THEORY
Estimation theory consists of estimation of
population parameters, or briefly parameters
(such as population mean and variance), from the
corresponding sample statistics, or briefly statistics
(such as sample mean and variance).
The general procedure for designing a
questionnaire
 choosing the broad topics that will reflect the
theme of the survey
 deciding on the mode of response
 formulating the questions
 pilot testing and making final revisions

In both the interview and self-administered
methods of data capture, a questionnaire is used.
Qualities of questionnaire
 Clarity
 Relevance
 Unbiasedness
• Moderate length of questions (not too long)
• Confidentiality (anonymity)
 Logical sequence of questions
• Reading from the available documents: If data collectors
use the above three methods, they are getting first-hand
data. Such data is referred to as primary data, and the sources
from which it is obtained are known as primary sources.
However, there are several limitations to primary data, e.g.
the costs involved. In most cases we rely on secondary data, i.e.
facts/data collected and documented by others. Secondary
data will usually be in the form of documents, e.g. financial reports,
business/economic reports, abstracts, journals, etc. Secondary data
can be divided into two categories: internal sources and
external sources.
ERROR IN DATA ACQUISITION
 Error is any discrepancy between the actual
result obtained and the correct result that would be
provided by an ideal procedure.
 Propagated Error; this is error obtained in the
process of calculation either by approximation or the
use of wrong procedure.
 Typographical Error; Error due to typing or human
error in writing the result obtained from the process.
 Error due to incomplete record or data.
Data Generation Methods
 Observation : use of human or animal sensing organs;
modern remote sensing techniques using cameras
 Experimental design: subject statistical units to treatments
and record the results
 Surveys and census: collect information from selected
statistical units of a population
Types of surveys
 Cross-sectional study : collect information on statistical units
at a specific time

 Longitudinal studies: studies which last for a certain period


of time when the statistical units are followed up (cohort
studies or retrospective studies)
Why use a sample?
 Cost: Cheaper than a census
 Speed: results are available faster
 Accuracy: limited non-random errors
 Destruction of test units: for studies which
require the destruction of statistical units, surveys
are the only way to do a study.
Phase I: Sampling design
 Definition of target population
 Development of a sampling frame
 Decision about the sample selection
 Probability or Nonprobability sampling
 Sampling Unit
 Random sampling error (chance fluctuations)
 Non-sampling error (design errors)
 Sample size
Phase II: Data collection
 Sample selection according to the sampling design
 Elaboration of data collection instruments
 Recruitment and training of enumerators
 Pre-test of data collection instruments
 Field data collection
Phase III: Data analysis and
dissemination
 Data entry mask or matrix
 Data entry and data processing
 Data analysis
 Descriptive statistics
 Estimations and tests of hypothesis
 Statistical modeling
 Reporting and dissemination
 Background, methodology, results, conclusion and
recommendations
METHOD OF SAMPLING
 Simple random sampling
Suppose a population consists of N sampling units
and a sample of n of those units is required.
A sample of size n is called a simple random sample
if all possible samples of size n are equally likely
to be selected.
Cont’d
 Advantages of Simple random sampling:
1. One of the great advantages of simple random
sampling method is that it needs only a minimum
knowledge of the study group of population in
advance.
2. It is free from errors in classification.
3. Simple random sampling is representative of the
population
Cont’d
4. The selection is free from personal bias on the part of the investigator.
5.The method is simple to use.
6. It is very easy to evaluate the sampling error in
this method.
Cont’d
 Disadvantages of Simple random sampling
1. For the same sample size, this method generally carries larger
sampling errors than stratified sampling.
2. In simple random sampling, selecting the sample
becomes impractical if the units or items are widely
dispersed.
3. One of the major disadvantages of simple random
sampling is that it is poorly suited to populations
whose units are heterogeneous in nature.
Cont’d
4. This method lacks the use of available knowledge
concerning the population.

5. It may be impossible to contact the cases which


are very widely dispersed.
Cont’d
 Systematic sampling
• An alternative procedure is to list the population in
some order, for example alphabetically or in order of
completion, and then choose every kth member from
the list after obtaining a random starting point.
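The procedure just described can be sketched in a few lines of Python. The sketch assumes the population size N is known and uses the interval $k = \lfloor N/n \rfloor$; `systematic_sample` and the `seed` parameter are illustrative choices, not part of the course material.

```python
import random


def systematic_sample(population: list, n: int, seed=None) -> list:
    """Pick every k-th member of an ordered list after a random start."""
    N = len(population)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and the population size")
    k = N // n                   # sampling interval
    rng = random.Random(seed)    # seed only makes the sketch reproducible
    start = rng.randrange(k)     # random starting point in 0 .. k-1
    return [population[start + i * k] for i in range(n)]


# Hypothetical ordered list of 100 unit identifiers
units = list(range(1, 101))
sample = systematic_sample(units, n=10, seed=1)
print(sample)  # 10 units spaced k = 10 apart
```

Only the starting point is random; once it is drawn, the rest of the sample is fully determined by the list order.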
Cont’d
 Main Advantages
 Systematic samples are relatively easy to construct, execute,
compare and understand. This is particularly important for studies
or surveys that operate with tight budget constraints.
 Main Disadvantages
 The systematic method assumes that the size of the population is
available or can be reasonably approximated. For instance, suppose
a researcher wants to study the size of rats in a given area. If he
doesn't have any idea how many rats there are, he cannot
systematically select a starting point or interval size.
Cont’d
 Stratified sampling: is used when the population is split into
groups that are quite different from one another and
which together cover the whole population
 Advantages
1. The biggest advantage of stratified random sampling is that it
reduces selection bias.
2. It ensures each subgroup within the population receives
proper representation within the sample.
Cont’d
 Disadvantages
1.Unfortunately, stratified random sampling cannot be
used in every study. The method's disadvantage is that
several conditions must be met for it to be used properly.
Researchers must identify every member of a population
being studied and classify each of them into one, and
only one, subpopulation
2.The other challenge is accurately sorting each member
of the population into a single stratum.
Cont’d
 Definition: Cluster sampling studies a cluster
of the relevant population. It is a design in which
the unit of sampling consists of multiple cases e.g.
a family, a class room, a school or even a city or a
school system. Cluster sampling is also known as
area sampling. Some authors consider it
synonymous with multistage sampling. In the
multistage sampling, the cases to be studied are
picked up randomly at different stages.
Cont’d
 Cluster sampling offers the following advantages:
• Cluster sampling is less expensive and
quicker. It is more economical to observe clusters
of units in a population than randomly selected
units scattered throughout the state.
• Cluster sampling permits easy accumulation of
large samples.
Cont’d
 The loss of precision per individual case is more
than compensated for by the possibility of
studying larger samples for the same cost.
 Cluster sample may combine the advantages of
both random sampling as well as stratified
sampling.
• The cluster sampling procedure makes it possible to obtain
information from one or more areas.
Cont’d
 The following are the disadvantages of Cluster
sampling:
• In a cluster sample, each cluster may be
composed of units that are like one another. This
may produce a large sampling error and reduce the
representativeness of the sample.
• In cluster sampling, when clusters of unequal size
are selected, an element of sample
bias will arise.
Cont’d
• It may not be possible to apply the findings of
this type of sampling to another area.
• If an adequate number of cases, from the
standpoint of increasing the precision of the sample,
is not selected, an overlapping effect may take
place.
Cont’d
 Quota sampling is a type of non-probability
sampling that involves a two-step process:
Step 1: Specify a list of relevant control categories
or quotas such as age, gender, income, or
education.
Step 2: Collect a sample that has the same
properties as the target population
Cont’d
 Advantages of Quota Sampling
 It is a useful technique to use in the preliminary
stages of research.
 It is easily administered.
 It allows the researcher to easily compare groups.
 It is useful when detailed accuracy is not
important.
 It can be used to obtain representative samples at
a relatively low cost.
Point Estimation
 Method of Maximum Likelihood
Likelihood function
Let X be a r.v. with p.d.f. $f(x; \theta)$ or p.m.f. $p(x; \theta)$, where $\theta$ is the unknown parameter.
Let $x_1, \dots, x_n$ be the observed values in a
random sample. Suppose that the $X_i$'s, $i = 1, \dots, n$,
are independent and identically distributed
(i.i.d.). Then the likelihood function is
defined by
$$L(\theta \mid x) = L(\theta \mid x_1, \dots, x_n) = \prod_{i=1}^{n} p(x_i; \theta) = p(x_1; \theta) \cdots p(x_n; \theta)$$
• or
$$L(\theta \mid x) = L(\theta \mid x_1, \dots, x_n) = \prod_{i=1}^{n} f(x_i; \theta) = f(x_1; \theta) \cdots f(x_n; \theta)$$
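Maximum likelihood estimator
• The maximum likelihood estimator of $\theta$ is
$$\mathrm{MLE}(\theta) = \hat{\theta} = \arg\max_{\theta} L(\theta \mid x)$$
• For a differentiable likelihood with an interior maximum, this means
$$\frac{d}{d\theta} L(\theta \mid x) = 0$$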
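Log-likelihood function
• The log-likelihood function is given by
$$\ln L(\theta \mid x) = \ln L(\theta \mid x_1, \dots, x_n) = \ln \prod_{i=1}^{n} f(x_i; \theta) = \sum_{i=1}^{n} \ln f(x_i; \theta) = \ln f(x_1; \theta) + \cdots + \ln f(x_n; \theta)$$
or
$$\ln L(\theta \mid x) = \ln L(\theta \mid x_1, \dots, x_n) = \ln \prod_{i=1}^{n} p(x_i; \theta) = \sum_{i=1}^{n} \ln p(x_i; \theta) = \ln p(x_1; \theta) + \cdots + \ln p(x_n; \theta)$$
So $\mathrm{MLE}(\theta)$ is the value $\hat{\theta}$ such that
$$\frac{d}{d\theta} \ln L(\theta \mid x) = 0$$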
Example
• Let X be a Bernoulli r.v. The p.m.f. is
$$f(x; p) = \begin{cases} p^x (1-p)^{1-x} & \text{if } x = 0, 1 \\ 0 & \text{otherwise} \end{cases}$$
where $p$ is the parameter to be estimated.
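For the Bernoulli case the log-likelihood is $\ln L(p \mid x) = \sum x_i \ln p + (n - \sum x_i)\ln(1-p)$, and setting its derivative to zero gives $\hat{p} = \bar{x}$, the sample proportion of ones. The sketch below checks this against a brute-force grid search; the data are hypothetical, chosen only for illustration.

```python
from math import log


def log_likelihood(p: float, xs: list) -> float:
    """Bernoulli log-likelihood: sum(x) ln p + (n - sum(x)) ln(1 - p)."""
    successes = sum(xs)
    return successes * log(p) + (len(xs) - successes) * log(1 - p)


def bernoulli_mle(xs: list) -> float:
    """Closed-form MLE obtained from d/dp ln L = 0: the sample mean."""
    return sum(xs) / len(xs)


# Hypothetical sample of 10 Bernoulli observations (not from the course)
sample = [1, 0, 0, 1, 1, 0, 1, 1, 0, 1]

p_hat = bernoulli_mle(sample)

# Numerical check: the log-likelihood on a grid peaks at the same value
grid = [i / 100 for i in range(1, 100)]
best_on_grid = max(grid, key=lambda p: log_likelihood(p, sample))

print(p_hat, best_on_grid)
```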


Method of Moments
 The general idea behind the method of moments
is to equate population moments, which are
defined in terms of expected values, to the
corresponding sample moments. The
population moments will be functions of the
unknown parameters. Then these equations are
solved to yield estimators of the unknown
parameters
Method of Moments (cont)
• The $k$th population moment is
$$\mu'_k = E[X^k], \quad k = 1, 2, \dots$$
• The corresponding $k$th sample moment is
$$m_k = \frac{1}{n} \sum_{i=1}^{n} X_i^k, \quad k = 1, 2, \dots$$
Example
• Suppose that $X_1, \dots, X_n$ is a random sample from a
normal distribution with parameters $\mu$ and $\sigma^2$.
For a r.v. X from the normal distribution, the first
and second population moments are respectively
$\mu'_1 = E[X] = \mu$ and $\mu'_2 = E[X^2] = \mu^2 + \sigma^2$.
The corresponding sample moments are
$m_1 = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $m_2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2$.
Equating them gives the estimators $\hat{\mu} = m_1$ and $\hat{\sigma}^2 = m_2 - m_1^2$.
Interval Estimation
 In practice, estimates are often given in the form
of the estimate plus or minus a certain amount

 one is quite accustomed to giving estimates in the


form of intervals
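General Method to Derive a Confidence Interval
• The shortest confidence interval for $\mu$ with
confidence coefficient $1 - \alpha$ is given by
$$\left[\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$$
where $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution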
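Confidence Interval on the Mean of a Normal Distribution
• Variance known
$$I_\mu = \left[\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$$
• Variance unknown
$$I_\mu = \left[\bar{X} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}},\ \bar{X} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right]$$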
Example
 Measurements of the diameters of a random
sample of 200 ball bearings made by a
certain machine during one week showed a
mean of 0.824 inch and a standard deviation
of 0.042 inch. Find 95% confidence limits
for the mean diameter of all the ball
bearings.
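With $n = 200$ the sample is large, so one common approach is to use the normal quantile $z_{\alpha/2}$ with the sample standard deviation in place of $\sigma$ (the $t$ quantile with 199 degrees of freedom would give almost the same limits). A sketch under that assumption; `z_interval` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean: float, sd: float, n: int, confidence: float) -> tuple:
    """Two-sided interval: mean +/- z_{alpha/2} * sd / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile, ~1.96 for 95%
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width


low, high = z_interval(mean=0.824, sd=0.042, n=200, confidence=0.95)
print(f"95% confidence limits: {low:.4f} to {high:.4f} inch")
```

The limits come out at roughly 0.818 and 0.830 inch.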
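Confidence Interval on the Variance of a Normal Distribution
• Mean known
$$I_{\sigma^2} = \left[\frac{n\hat{\sigma}^2}{\chi^2_{1-\alpha/2}(n)},\ \frac{n\hat{\sigma}^2}{\chi^2_{\alpha/2}(n)}\right]$$
• Mean unknown
$$I_{\sigma^2} = \left[\frac{n s^2}{\chi^2_{1-\alpha/2}(n-1)},\ \frac{n s^2}{\chi^2_{\alpha/2}(n-1)}\right]$$
where $\chi^2_p(\nu)$ is the $p$ quantile of the chi-square distribution with $\nu$ degrees of freedom, $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2$ and $s^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$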
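Confidence Interval on the Proportion
$$I_p = \left[\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right]$$
where $\hat{p}$ is the sample proportion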
Example
A sample poll of 100 voters chosen
at random from all voters in a given
district indicated that 55% of them
were in favor of a particular
candidate. Find 99% confidence
limits for the proportion of all the
voters in favor of this candidate.
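The poll example can be worked with the large-sample interval for a proportion, $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. A sketch assuming the normal approximation is acceptable for $n = 100$; `proportion_interval` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p_hat: float, n: int, confidence: float) -> tuple:
    """Large-sample interval: p_hat +/- z_{alpha/2} * sqrt(p_hat (1 - p_hat) / n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 2.576 for 99%
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


low, high = proportion_interval(p_hat=0.55, n=100, confidence=0.99)
print(f"99% confidence limits: {low:.3f} to {high:.3f}")
```

The limits are roughly 0.42 and 0.68. Because the interval contains 0.5, this sample alone does not establish that the candidate has majority support.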
TESTS OF HYPOTHESES FOR A
SIMPLE SAMPLE
 many problems in practice require that we decide
whether to accept or reject a statement about
some parameter
 the statement is called a hypothesis, and the
decision-making procedure about the hypothesis
is called hypothesis testing
Definitions
 Statistical hypothesis: statement about a
population parameter
 Null hypothesis and Alternative
hypothesis: two complementary hypotheses in a
hypothesis testing problem are called the null
hypothesis and the alternative hypothesis
 Hypothesis testing procedure : a rule that
specifies:
I. For which sample values the decision is made to
accept null hypothesis as true.
II. For which sample values null hypothesis is
rejected and alternative hypothesis is accepted
as true.
Definitions(cont)
 Test Errors

                  Decision
Truth             Fail to reject H0     Reject H0
H0 is true        Correct decision      Type I error
H0 is false       Type II error         Correct decision
Definitions(cont)
• The probability of making a Type I error:
$$\alpha = P(\text{Type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$$
• The probability of making a Type II error:
$$\beta = P(\text{Type II error}) = P(\text{fail to reject } H_0 \mid H_0 \text{ is false})$$


Example
• A random variable has a normal distribution with mean $\mu$ and
standard deviation 3. The null hypothesis $\mu = 20$ is to be tested
against the alternative hypothesis $\mu > 20$ using a random sample of
size 25. It is decided that the null hypothesis will be rejected if
the sample mean is greater than 21.4.
a) Calculate the probability of making a Type I error
b) Calculate the probability of making a type II error, when in
fact
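A minimal sketch of both calculations, assuming the sample mean is normally distributed with standard deviation 3/√25. The true mean needed for part b is missing from the source, so the hypothetical helper `type_ii_error` takes it as a parameter rather than guessing a value.

```python
from math import sqrt
from statistics import NormalDist

sigma, n = 3.0, 25   # population standard deviation and sample size
mu0 = 20.0           # mean under the null hypothesis
cutoff = 21.4        # reject H0 when the sample mean exceeds this value

se = sigma / sqrt(n)  # standard deviation of the sample mean

# a) Type I error: P(sample mean > cutoff | mu = mu0)
alpha = 1 - NormalDist(mu0, se).cdf(cutoff)
print(f"alpha = {alpha:.4f}")


# b) Type II error: P(sample mean <= cutoff | mu = true_mu)
def type_ii_error(true_mu):
    """Probability of failing to reject H0 when the mean is true_mu."""
    return NormalDist(true_mu, se).cdf(cutoff)
```

The Type I error probability is just under 0.01. Calling `type_ii_error` with the true mean stated in part b gives the corresponding β, which shrinks as that mean moves further above 21.4.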
Definitions(cont)
• Power of a statistical test: the probability of
rejecting $H_0$ when $H_0$ is false
$$\text{power} = 1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ is false}) = 1 - P(\text{fail to reject } H_0 \mid H_0 \text{ is false})$$
• Rejection region and acceptance region:
the subset of the sample space for which the null
hypothesis will be rejected is called the rejection
region or critical region. The complement of the
rejection region is called the acceptance region.
Null Hypothesis and Alternative Hypothesis
• In every hypothesis testing problem, there is a
pair of competing statistical hypotheses: the null
hypothesis $H_0: \theta = \theta_0$ for an unknown population
parameter $\theta$, and the alternative hypothesis $H_1$,
which occurs in one of four general forms:
• Two-sided alternative: $H_1: \theta \neq \theta_0$
• Left directional alternative: $H_1: \theta < \theta_0$
• Right directional alternative: $H_1: \theta > \theta_0$
• $H_1: \theta = \theta_1$
General Procedure for Hypothesis Tests
• From the problem context, identify the parameter of
interest, $\theta$.
• State the null hypothesis, $H_0$.
• Specify an appropriate alternative hypothesis, $H_1$.
• Choose a significance level, $\alpha$.
• Determine an appropriate test statistic.
• State the rejection region for the statistic.
• Compute any necessary sample quantities, substitute
these into the equation for the test statistic, and compute
that value.
• Decide whether or not $H_0$ should be rejected and report
that in the problem context.
Tests About the Mean $\mu$ of a Normal Distribution, Variance $\sigma^2$ Known
• The sample mean $\bar{X}$ is an unbiased point estimator of $\mu$
with variance $\sigma^2/n$, i.e. $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
• We wish to test the hypothesis
$$H_0: \mu = \mu_0 \quad \text{versus} \quad H_1: \mu \neq \mu_0$$
• Standardize the sample mean and use a test
statistic based on the standard normal distribution:
$$z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
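Tests About the Mean $\mu$ of a Normal Distribution, Variance $\sigma^2$ Known (cont)
• The acceptance region for $H_0$ is given by the
following inequalities:
$$-z_{\alpha/2} \le z_0 \le z_{\alpha/2}$$
• The critical region or rejection region is defined by
$$z_0 > z_{\alpha/2} \quad \text{or} \quad z_0 < -z_{\alpha/2}$$
• or equivalently, we reject $H_0$ if
$$\bar{X} > \mu_0 + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \quad \text{or} \quad \bar{X} < \mu_0 - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
• or if
$$\bar{X} \notin \left[\mu_0 - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \mu_0 + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$$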
Test procedure
• State $H_0$ and $H_1$
• Determine a test statistic and its value
$$z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
• Determine a critical value for $\alpha$:
$z_{\alpha/2}$ for a two-sided test; $z_{\alpha}$ for a one-sided test.
• Make a conclusion. Reject $H_0$ if
$|z_0| > z_{\alpha/2}$ for a two-sided test, $z_0 > z_{\alpha}$ for an upper-sided
test and $z_0 < -z_{\alpha}$ for a lower-sided test.
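The test procedure for a known variance can be sketched as a small function. This is an illustrative helper, not part of the course notes: the name `z_test`, its parameters and the numbers in the usage line are all assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu0, sigma, n, alpha, alternative="two-sided"):
    """Return (z0, critical value, reject H0?) for a z test on the mean."""
    # Test statistic: standardized distance of the sample mean from mu0
    z0 = (x_bar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "two-sided":
        crit = std_normal.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
        reject = abs(z0) > crit
    elif alternative == "upper":
        crit = std_normal.inv_cdf(1 - alpha)      # z_alpha
        reject = z0 > crit
    elif alternative == "lower":
        crit = std_normal.inv_cdf(1 - alpha)      # z_alpha
        reject = z0 < -crit
    else:
        raise ValueError("alternative must be 'two-sided', 'upper' or 'lower'")
    return z0, crit, reject


# Made-up illustrative numbers: x_bar = 52, mu0 = 50, sigma = 5, n = 25
z0, crit, reject = z_test(52, 50, 5, 25, alpha=0.05)
print(f"z0 = {z0:.2f}, critical value = {crit:.2f}, reject H0: {reject}")
```

Returning the statistic and the critical value together with the decision keeps each step of the procedure visible, so the result can be checked against a normal table by hand.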
Test procedure (cont)
[Figure: rejection regions for the alternatives $H_1: \mu \neq \mu_0$, $H_1: \mu > \mu_0$ and $H_1: \mu < \mu_0$]
Example
• As a chemist working for a battery manufacturer, you are
given the problem of developing an improved battery for
a calculator that will last “significantly longer” than the
current battery. You know that the measures of the
current battery's life in the calculator are normally
distributed with $\mu = 100.3$ min and $\sigma = 6.25$ min. You
develop an improved battery that theoretically should
last longer, and from preliminary tests you decide that
you can assume its lifetime measures are also normally
distributed with $\sigma = 6.25$ min.
To do a test of $H_0: \mu = 100.3$ min, you take a sample of
$n = 15$ lifetimes of the improved battery in the calculator
and find that $\bar{x} = 105.6$ min. Does your new battery
prove that it is different from the current one at
$\alpha = 0.01$?
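A worked computation for this example, as a sketch under the assumption that the alternative is two-sided, $H_1: \mu \neq 100.3$, since the question asks whether the new battery is different:

$$z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{105.6 - 100.3}{6.25/\sqrt{15}} \approx \frac{5.3}{1.614} \approx 3.28, \qquad z_{\alpha/2} = z_{0.005} \approx 2.576$$

Since $|z_0| \approx 3.28 > 2.576$, $H_0$ is rejected at $\alpha = 0.01$: the sample gives evidence that the mean lifetime of the improved battery differs from 100.3 min. The conclusion would be the same for the upper-sided alternative $H_1: \mu > 100.3$, because $z_{0.01} \approx 2.326$.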
Tests About the Mean $\mu$ of a Normal Distribution, Variance $\sigma^2$ Unknown
• The appropriate test statistic is the t statistic
$$t_0 = \frac{\bar{X} - \mu_0}{S_{\bar{X}}} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$
• The acceptance region is given by
$$-t_{\alpha/2,\,n-1} \le t_0 \le t_{\alpha/2,\,n-1}$$
• The critical region or rejection region is defined by
$$t_0 > t_{\alpha/2,\,n-1} \quad \text{or} \quad t_0 < -t_{\alpha/2,\,n-1}$$
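Tests About the Mean $\mu$ of a Normal Distribution, Variance $\sigma^2$ Unknown (cont)
• We reject $H_0$ if
$$\bar{X} > \mu_0 + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \quad \text{or} \quad \bar{X} < \mu_0 - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$$
• or equivalently, we reject $H_0$ if
$$\bar{X} \notin \left[\mu_0 - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}},\ \mu_0 + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right]$$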
Test procedure (t test)
• State $H_0$ and $H_1$
• Determine a test statistic and its value
$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \sim t(n-1)$$
• Determine a critical value for $\alpha$: $t_{\alpha/2,\,n-1}$ for a
two-sided test and $t_{\alpha,\,n-1}$ for a one-sided test
• Make a conclusion. Reject $H_0$ if $|t_0| > t_{\alpha/2,\,n-1}$ for a
two-sided test, $t_0 > t_{\alpha,\,n-1}$ for an upper-sided test and
$t_0 < -t_{\alpha,\,n-1}$ for a lower-sided test
Test procedure (cont)
[Figure: rejection regions of the t test for the alternatives $H_1: \mu \neq \mu_0$, $H_1: \mu > \mu_0$ and $H_1: \mu < \mu_0$]
Example
• A sample ($n = 20$, $\bar{x} = 4.0$, $s = 0.83$) is taken from a
normally distributed population that has an
unknown $\mu$ and unknown $\sigma$. Test the
hypothesis $H_0: \mu = 3.6$ versus $H_1: \mu \neq 3.6$ at
$\alpha = 0.05$
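A minimal sketch of this t test. The direction of the alternative is not legible in the source, so the code checks both the two-sided and the upper-sided rule. The critical values are assumed inputs read from a Student t table for 19 degrees of freedom, since the Python standard library has no t quantile function.

```python
from math import sqrt

n, x_bar, s = 20, 4.0, 0.83  # sample values from the example
mu0 = 3.6                    # value of the mean under H0

# Observed t statistic with n - 1 = 19 degrees of freedom
t0 = (x_bar - mu0) / (s / sqrt(n))

# Critical values at alpha = 0.05, taken from a t table (assumed inputs)
t_two_sided = 2.093  # t_{0.025, 19}
t_one_sided = 1.729  # t_{0.05, 19}

print(f"t0 = {t0:.3f}")
print("reject H0 (two-sided):", abs(t0) > t_two_sided)
print("reject H0 (upper-sided):", t0 > t_one_sided)
```

The statistic is about 2.155, which exceeds both critical values, so $H_0: \mu = 3.6$ is rejected at $\alpha = 0.05$ under either reading of the alternative.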
Tests About the Population Proportion p
• It is often necessary to test hypotheses on a
population proportion. Consider the situation
where a sample of size $n$ is taken under Bernoulli
trials from a large population that has an
unknown proportion $p$ of success elements.
• The hypotheses are $H_0: p = p_0$ against one of the alternatives
$H_1: p \neq p_0$, $H_1: p < p_0$ or $H_1: p > p_0$
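Tests about the Proportion(ctd)
• The test procedure uses the following test statistic
(if $np_0 \ge 5$ and $nq_0 = n(1 - p_0) \ge 5$), where $\hat{P}$ is the sample proportion:
$$Z = \frac{\hat{P} - p}{\sigma_{\hat{P}}} = \frac{\hat{P} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} \quad \text{under } H_0$$
• The acceptance region is given by
$$-z_{\alpha/2} \le z_0 \le z_{\alpha/2}$$
• The critical region or rejection region is defined by
$$z_0 > z_{\alpha/2} \quad \text{or} \quad z_0 < -z_{\alpha/2}$$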
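Tests about the Proportion(ctd)
• We reject $H_0: p = p_0$ if the sample proportion $\hat{p}$ satisfies
$$\hat{p} > p_0 + z_{\alpha/2}\sqrt{\frac{p_0(1 - p_0)}{n}} \quad \text{or} \quad \hat{p} < p_0 - z_{\alpha/2}\sqrt{\frac{p_0(1 - p_0)}{n}}$$
• or equivalently, if
$$\hat{p} \notin \left[p_0 - z_{\alpha/2}\sqrt{\frac{p_0(1 - p_0)}{n}},\ p_0 + z_{\alpha/2}\sqrt{\frac{p_0(1 - p_0)}{n}}\right]$$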
Test procedure (z test)
• State $H_0$ and $H_1$
• Determine a test statistic and its value, with $\hat{p}$ the sample proportion:
$$z_0 = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$$
• Determine a critical value for $\alpha$:
$z_{\alpha/2}$ for a two-sided test; $z_{\alpha}$ for a one-sided test
• Make a conclusion. Reject $H_0$ if
$|z_0| > z_{\alpha/2}$ for a two-sided test, $z_0 > z_{\alpha}$ for an upper-sided test
and $z_0 < -z_{\alpha}$ for a lower-sided test
Example
• The chairman of a university department takes a
random sample of 75 of the 1727 students, and
asks each of them: Which of the courses offered by
the department should be retained? He decides
that if significantly fewer than 20% of the
students want a course retained, then it will be
eliminated. If the result for Mathematics is that
11 of 75 want it retained, then determine its fate
by testing $H_0: p = 0.20$ versus $H_1: p < 0.20$ at
$\alpha = 0.01$
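A minimal sketch of this lower-sided test on a proportion, assuming the normal approximation from the preceding slides. It ignores any finite-population correction for sampling 75 of 1727 students, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, successes = 75, 11   # sample size and number who want the course retained
p0, alpha = 0.20, 0.01  # hypothesised proportion and significance level

# Condition for the normal approximation: n*p0 >= 5 and n*(1 - p0) >= 5
assert n * p0 >= 5 and n * (1 - p0) >= 5

p_hat = successes / n
z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Lower-sided test: reject H0 when z0 < -z_alpha
z_alpha = NormalDist().inv_cdf(1 - alpha)
reject = z0 < -z_alpha

print(f"p_hat = {p_hat:.3f}, z0 = {z0:.3f}, -z_alpha = {-z_alpha:.3f}, reject H0: {reject}")
```

Here $z_0 \approx -1.15$ is not below $-z_{0.01} \approx -2.33$, so $H_0$ is not rejected at $\alpha = 0.01$. Under the chairman's rule, the sample does not justify eliminating Mathematics.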