
Chapter 8 - Sampling Distribution

The document discusses fundamental concepts related to sampling distributions and statistics. It defines key terms like population, sample, statistic, and sampling distribution. It explains important statistics like the sample mean, median, mode, variance, and standard deviation. It also covers important probability distributions like the normal, binomial, Poisson, and t-distributions. It describes the central limit theorem and how the distribution of sample means approximates the normal distribution as the sample size increases. Finally, it provides examples of calculating probabilities and t-scores from t-distributions.
Copyright
© All Rights Reserved

Chapter 8

Fundamental Sampling Distributions and Data Descriptions
Populations and Samples

• Population: The population is the total set of observations with which we are concerned.
• Sample: A sample is a smaller set of data drawn from a population.

Samples are used in statistical testing when the population is too large for the test to include all possible members or observations.
Some Important Statistics

Statistic:

Suppose we wish to arrive at a conclusion concerning the proportion of soft-drink drinkers in Bangladesh who prefer a certain brand, Coca-Cola.
• Many random samples are possible from the same population.
• P = the proportion of people in a given sample favoring the brand Coca-Cola.
• p = the value of the random variable P computed from one particular sample.
Because P varies from sample to sample, it is a random variable. Such a random variable is called a statistic.
A statistic or sample statistic is any random variable
computed from a random sample used for a statistical
purpose.
Statistical purposes include estimating a population
parameter, describing a sample, or evaluating a
hypothesis.
e.g. The average (mean) of sample values is a
statistic.
# A statistic is a characteristic of a sample.
Location Measures of a Sample

Sample mean: The average of a set of data,

$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$

• The term sample mean is applied to both the statistic $\bar{X}$ and its computed value $\bar{x}$.
• The sample mean measures the central tendency of a data set.
Sample median:
• The sample median is also a location measure; it shows the middle value of the sample.

Sample mode:
The sample mode is the value of the sample that
occurs most often.
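
The three location measures above can be computed with Python's standard `statistics` module. This is only a minimal sketch: the five values in `sample` are hypothetical and are not taken from the slides.

```python
import statistics

# Hypothetical sample, chosen only to illustrate the three measures.
sample = [2, 3, 3, 5, 7]

sample_mean = statistics.mean(sample)      # sum of values divided by n
sample_median = statistics.median(sample)  # middle value of the sorted sample
sample_mode = statistics.mode(sample)      # most frequent value

print("mean:", sample_mean)
print("median:", sample_median)
print("mode:", sample_mode)
```

For this sample the mean (4) is pulled above the median (3) by the single larger value 7, which is why the two measures can disagree on skewed data.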
Variability Measures of a Sample

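Sample variance:

$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$

• The computed value of $S^2$ for a given sample is denoted by $s^2$.
• The sample variance, $s^2$, measures how spread out the sample values are around the sample mean.

Sample standard deviation:

$S = \sqrt{S^2}$

where $S^2$ is the sample variance.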

Sample range:
Let Xmax denote the largest of the Xi values and Xmin
the smallest.
R = Xmax − Xmin.
Example: The grade-point averages of 20 college
seniors selected at random from a graduating class are
as follows:
3.2 1.9 2.7 2.4 2.8
2.9 3.8 3.0 2.5 3.3
1.8 2.5 3.7 2.8 2.0
3.2 2.3 2.1 2.5 1.9
Calculate the standard deviation.
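
One way to work this example is sketched below in Python. It assumes the usual sample variance with the n − 1 divisor; the variable names are illustrative only, and the result is cross-checked against `statistics.stdev`.

```python
import math
import statistics

# Grade-point averages from the example above (n = 20).
gpas = [
    3.2, 1.9, 2.7, 2.4, 2.8,
    2.9, 3.8, 3.0, 2.5, 3.3,
    1.8, 2.5, 3.7, 2.8, 2.0,
    3.2, 2.3, 2.1, 2.5, 1.9,
]

n = len(gpas)
mean = sum(gpas) / n

# Sample variance: squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in gpas) / (n - 1)
std_dev = math.sqrt(variance)

# Sample range: largest value minus smallest value.
sample_range = max(gpas) - min(gpas)

print(f"mean     = {mean:.3f}")
print(f"variance = {variance:.3f}")
print(f"std dev  = {std_dev:.3f}")
print(f"range    = {sample_range:.1f}")

# Cross-check against the standard library implementation.
print(f"statistics.stdev = {statistics.stdev(gpas):.3f}")
```

The sample mean is 2.665, the sample variance is about 0.342, and the standard deviation is about 0.585.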
Sampling Distributions

A sampling distribution is the probability distribution of a statistic obtained from a large number of samples drawn from a specific population.
Sampling Distribution of Means

Central Limit Theorem

The central limit theorem states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples of size n from the population with replacement, then the sample means will be approximately normally distributed, with mean μ and standard deviation σ/√n.
KEY TAKEAWAYS
• The central limit theorem (CLT) states that the distribution of sample means approximates a normal distribution as the sample size gets larger.
• Sample sizes equal to or greater than 30 are often considered sufficient for the CLT to hold.
• A key aspect of the CLT is that the average of the sample means equals the population mean, while the standard deviation of the sample means equals the population standard deviation divided by the square root of the sample size (σ/√n), not σ itself.
• A sufficiently large sample can predict the characteristics of a population accurately.
Fig: Illustration of the Central Limit Theorem
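
The behaviour described by the CLT can be checked with a small simulation. The sketch below is an illustration under assumptions that are not in the slides: the population is exponential with rate 1 (so μ = 1 and σ = 1, and strongly skewed), each sample has n = 30 observations, and 5000 samples are drawn with a fixed seed.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

n = 30              # size of each sample (assumed)
num_samples = 5000  # number of samples drawn (assumed)

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0

# Draw many samples and record the mean of each one.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {mu})")
print(f"sd of sample means:   {sd_of_means:.3f} "
      f"(sigma / sqrt(n) = {sigma / math.sqrt(n):.3f})")
```

The mean of the sample means should land close to μ, and their standard deviation close to σ/√n ≈ 0.183 rather than σ = 1, even though the population itself is far from bell shaped.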
Probability Distribution
A probability distribution is the mathematical
function that gives the probabilities of occurrence of
different possible outcomes for an experiment.

- Binomial distribution
- Poisson distribution
- Normal distribution etc.
Normal distribution
• Early statisticians noticed the same shape coming up over and over again in different distributions, so they named it the normal distribution.
• A normal distribution, sometimes called the bell curve, is a distribution that occurs naturally in many situations:
• Heights of people
• Measurement errors
• Blood pressure
• Birth weight
• Reading ability
• Job satisfaction
• Rainfall, etc.
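The density of the normal random variable X, with mean μ and variance σ², is

$n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$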
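Probability / Areas under the Normal Curve
The area under the curve bounded by the two ordinates x = x1 and x = x2 equals the probability that the random variable X assumes a value between x1 and x2.

Thus, for the normal curve,

$P(x_1 < X < x_2) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx$

We are able to transform all the observations of any normal random variable X into a new set of observations of a standard normal random variable Z with mean 0 and variance 1.
This can be done by means of the transformation

$Z = \frac{X - \mu}{\sigma}$

Consequently, we may write

$P(x_1 < X < x_2) = P(z_1 < Z < z_2)$, with $z_1 = \frac{x_1 - \mu}{\sigma}$ and $z_2 = \frac{x_2 - \mu}{\sigma}$,

where Z is seen to be a normal random variable with mean 0 and variance 1.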
# Given a standard normal distribution, find the area under the curve that lies
(a) to the right of z = 1.84 and
(b) between z = −1.97 and z = 0.86.

(a) The area in Figure (a) to the right of z = 1.84 is equal to 1 minus the area in
Table A.3 to the left of z = 1.84, namely, 1 − 0.9671 = 0.0329.
(b) The area in Figure (b) between z = −1.97 and z = 0.86 is equal to the
area to the left of z = 0.86 minus the area to the left of z = −1.97. From
Table A.3 we find the desired area to be 0.8051 − 0.0244 = 0.7807.
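
The table lookups in this example can be reproduced numerically. The sketch below is one possible check, not the method used in the slides: the helper `phi` is a name introduced here, and it evaluates the standard normal cumulative probability through the error function in Python's `math` module.

```python
import math

def phi(z: float) -> float:
    """Standard normal CDF, P(Z < z), written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# (a) Area to the right of z = 1.84.
area_right = 1.0 - phi(1.84)

# (b) Area between z = -1.97 and z = 0.86.
area_between = phi(0.86) - phi(-1.97)

print(f"(a) {area_right:.4f}")
print(f"(b) {area_between:.4f}")
```

Both results should agree with the Table A.3 values above to four decimal places.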
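# Given a random variable X having a normal distribution with μ = 50 and σ = 10, find the probability that X assumes a value between 45 and 62.

The z values corresponding to x1 = 45 and x2 = 62 are

$z_1 = \frac{45 - 50}{10} = -0.5$ and $z_2 = \frac{62 - 50}{10} = 1.2$

Therefore,
P(45 < X < 62) = P(−0.5 < Z < 1.2)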
Using Table A.3, we have
P(45 < X < 62) = P(−0.5 < Z < 1.2) = P(Z < 1.2) − P(Z < −0.5)
= 0.8849 − 0.3085 = 0.5764.
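
The same probability can be checked without standardizing by hand. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

# X is normal with mean 50 and standard deviation 10, as in the example.
x_dist = NormalDist(mu=50, sigma=10)

# P(45 < X < 62) as a difference of cumulative probabilities.
prob_direct = x_dist.cdf(62) - x_dist.cdf(45)

# The same probability after standardizing: z = (x - mu) / sigma.
z_low = (45 - 50) / 10
z_high = (62 - 50) / 10
std_normal = NormalDist()  # mean 0, standard deviation 1
prob_standardized = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(f"z range: {z_low} to {z_high}")
print(f"P(45 < X < 62) = {prob_direct:.4f}")
print(f"P(z_low < Z < z_high) = {prob_standardized:.4f}")
```

Both routes should give about 0.5764, which illustrates that the transformation to Z changes the scale but not the probability.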
t-Distribution

In the previous discussions, it was shown that when the population is normally distributed, or when the sample size is large enough, the sampling distribution of the mean is normal or approximately normal. And of course, the bell curve is very handy to use.

However, in many cases we can only obtain small samples and the population standard deviation is unknown, so the standardized sample mean no longer follows a normal distribution. Instead, we use the t-distribution, which is the distribution of t-scores.
The t-distribution is any member of a family of continuous probability distributions that arises when estimating the mean of a normally distributed population in situations where the sample size is small and the population standard deviation is unknown.
KEY TAKEAWAYS
• The t-distribution is used when the data are approximately normally distributed (that is, they follow a bell shape) but the population variance is unknown.
• The population variance is estimated from the sample, and the shape of the t-distribution depends on the degrees of freedom of the data set (total number of observations minus 1) rather than on the true standard deviation.
• The t-distribution is similar to the normal distribution, but with a slightly lower peak and fatter tails.
• t-distributions have higher kurtosis than normal distributions.
• The probability of getting values very far from the mean is larger with a t-distribution than with a normal distribution.
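
The lower peak and fatter tails can be seen by evaluating the two densities directly. The sketch below uses the standard formula for the Student t density with v degrees of freedom; the function names and the evaluation points (0 and 3) are choices made here for illustration.

```python
import math

def t_pdf(x: float, v: float) -> float:
    """Density of Student's t-distribution with v degrees of freedom."""
    # Work with log-gamma to stay numerically stable for large v.
    log_norm = (
        math.lgamma((v + 1) / 2)
        - math.lgamma(v / 2)
        - 0.5 * math.log(v * math.pi)
    )
    return math.exp(log_norm - (v + 1) / 2 * math.log1p(x * x / v))

def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

print("dist      density at 0   density at 3")
for v in (2, 5, 30):
    print(f"t, v={v:<4} {t_pdf(0, v):.5f}        {t_pdf(3, v):.5f}")
print(f"normal    {normal_pdf(0):.5f}        {normal_pdf(3):.5f}")
```

At 0 every t density sits below the normal density, while at 3 every t density sits above it, and the gap narrows as v grows.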
Fig: (a) The t-distribution curves for v = 2, 5, and ∞; (b) Symmetry property (about 0) of the t-distribution
EXAMPLE 2
Find the t-score for a sample size of 16 taken from
a population with mean 10 when the sample mean
is 12 and the sample standard deviation is 1.5.
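
A sketch of the calculation for this example follows. It assumes the t-score meant here is the usual one-sample statistic t = (x̄ − μ) / (s / √n) with n − 1 degrees of freedom; the slide does not show the formula, so treat that as an assumption.

```python
import math

n = 16       # sample size
mu = 10.0    # population mean
x_bar = 12.0 # sample mean
s = 1.5      # sample standard deviation

# Estimated standard error of the mean: s / sqrt(n).
standard_error = s / math.sqrt(n)

# One-sample t-score (assumed formula, see the note above).
t_score = (x_bar - mu) / standard_error
degrees_of_freedom = n - 1

print(f"standard error = {standard_error:.3f}")
print(f"t = {t_score:.3f} with {degrees_of_freedom} degrees of freedom")
```

Under that assumption the standard error is 0.375 and the t-score is about 5.33 with 15 degrees of freedom.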

Ans: 1 - 0.05 - 0.10 = 0.85


F-distribution
