Data Analysis Workshop - Normal Distribution & Confidence Interval

Sri Sathya Sai University of Human Excellence, Kalaburgi (Karnataka)

6 Days Workshop on
Basic Data Analytics for Data Analytics using Excel & PSPP
(02/09/2021 - 07/09/2021)

Normal Distribution & Confidence Interval
Dr. Rohit Kanda
Frequency Distribution
• Marketing researchers often need to answer questions about a single
variable. For example:
– How many users of the brand may be characterized as brand loyal?
– What percentage of the market consists of heavy users, medium users, light users,
and non-users?
– How many customers are very familiar with a new product offering? How many
are familiar, somewhat familiar, and unfamiliar with the brand? What is the mean
familiarity rating? Is there much variance in the extent to which customers are
familiar with the new product?
– What is the income distribution of brand users? Is this distribution skewed toward
low-income brackets?
• The answers to these kinds of questions can be determined by examining
frequency distributions. In a frequency distribution, one variable is
considered at a time. The objective is to obtain a count of the number of
responses associated with different values of the variable. The relative
occurrence, or frequency, of different values of the variable is then
expressed in percentages. A frequency distribution for a variable produces
a table of frequency counts, percentages, and cumulative percentages for
all the values associated with that variable.
Example. Frequency Distribution of Familiarity
Value Label       Value   Frequency (n)   Percentage   Valid Percentage   Cumulative Percentage
Very Unfamiliar   1       0               0.0          0.0                0.0
                  2       2               6.7          6.9                6.9
                  3       6               20.0         20.7               27.6
                  4       6               20.0         20.7               48.3
                  5       3               10.0         10.3               58.6
                  6       8               26.7         27.6               86.2
Very Familiar     7       4               13.3         13.8               100.0
Missing           9       1               3.3
Total                     30              100.0        100.0
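As a rough illustration, the table above can be rebuilt in a few lines of Python. This is only a sketch, not the Excel or PSPP procedure used in the workshop: the list of responses is reconstructed from the frequency column, and the function name `frequency_distribution` and the constant `MISSING` are hypothetical names chosen for this example. Valid percentages divide by the 29 non-missing responses rather than by all 30.

```python
from collections import Counter

# Responses rebuilt from the frequency column of the table above;
# 9 is the code for a missing response.
MISSING = 9
counts_from_table = {1: 0, 2: 2, 3: 6, 4: 6, 5: 3, 6: 8, 7: 4, MISSING: 1}
responses = [value for value, n in counts_from_table.items() for _ in range(n)]


def frequency_distribution(data, values, missing_code):
    """Return rows of (value, count, pct, valid_pct, cumulative_valid_pct)."""
    counts = Counter(data)
    total = len(data)                            # all responses, missing included
    valid_total = total - counts[missing_code]   # non-missing responses only
    rows, cumulative = [], 0
    for value in values:
        n = counts[value]
        cumulative += n
        rows.append((
            value,
            n,
            round(100 * n / total, 1),
            round(100 * n / valid_total, 1),
            round(100 * cumulative / valid_total, 1),
        ))
    return rows


table = frequency_distribution(responses, range(1, 8), MISSING)
for row in table:
    print(row)
```

Each printed row corresponds to one rating from 1 to 7, in the same column order as the table: value, frequency, percentage, valid percentage, cumulative percentage.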
[Figure: bar chart "Familiarity" showing the frequency (vertical axis, 0 to 9) of each familiarity rating from 1 to 7 (horizontal axis)]
Normal Distribution
Skewed Distribution
Abnormal Distribution
• Skewness: A characteristic of a distribution
that assesses its symmetry about the mean.
• Kurtosis: A measure of the relative peakedness
or flatness of the curve defined by the
frequency distribution.
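The slides do not give formulas for these two measures, so the sketch below assumes the common moment-based definitions: skewness as the third central moment divided by the 1.5th power of the variance, and excess kurtosis as the fourth central moment divided by the squared variance, minus 3 (so that a normal curve scores 0). The function names are hypothetical, and statistical packages often apply sample-size corrections, so their output may differ slightly.

```python
def central_moment(data, k):
    """k-th central moment, dividing by n (population form)."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)


def skewness(data):
    """0 for a symmetric distribution, positive for a long right tail."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Positive for a peaked curve, negative for a flat one, 0 for normal."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

print("skewness (symmetric):   ", skewness(symmetric))
print("skewness (right-skewed):", skewness(right_skewed))
print("excess kurtosis (flat): ", excess_kurtosis(symmetric))
```

The evenly spread values have zero skewness and negative excess kurtosis (flatter than a normal curve), while the data set with one large value has positive skewness.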
Confidence Interval
• In statistics, a confidence interval is an interval
estimate, calculated from the observed data, that is
likely to contain the true value of an unknown
parameter. It is associated with a confidence level
that quantifies how confident we can be that the
interval captures that parameter. The intervals
discussed here are based on the standard normal
distribution, where the Z value is the z-score. Here, let
us look at the definition, formula, table, and
calculation of the confidence interval in detail.
Confidence Interval Definition
• The confidence level represents the proportion (long-run
frequency) of confidence intervals that contain the true
value of the unknown parameter. In other words, if
confidence intervals are constructed at a given confidence
level from an endless number of independent samples, the
proportion of those intervals that contain the true value
of the parameter will be equal to the confidence level.
• Usually, the confidence level is selected before examining
the data. The most commonly used confidence level is 95%.
However, other confidence levels, such as 90% and 99%, are
also used.
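The "proportion of intervals" reading of the confidence level can be checked with a small simulation. The sketch below is only illustrative: the population mean of 50, standard deviation of 10, sample size of 30, number of trials, and random seed are all arbitrary assumptions, and `coverage` is a hypothetical function name. It draws many independent samples from a normal population, builds a 95% interval from each, and reports the share of intervals that contain the true mean.

```python
import random
from math import sqrt


def coverage(trials=2000, n=30, mu=50.0, sigma=10.0, z=1.960, seed=1):
    """Share of z-based intervals (known sigma) that contain the true mean mu."""
    rng = random.Random(seed)
    margin = z * sigma / sqrt(n)   # same margin for every sample, since sigma is known
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = sum(sample) / n
        if mean - margin <= mu <= mean + margin:
            hits += 1
    return hits / trials


rate = coverage()
print(f"share of intervals containing the true mean: {rate:.3f}")
```

With enough trials the printed share should settle close to 0.95, which is what a 95% confidence level means.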
Confidence Interval Formula
• The confidence interval for a mean is based on the sample mean and the
standard deviation. Thus, the formula to find the CI is
• X̄ ± Zα/2 × [ σ / √n ]
• Where
• X̄ = Sample mean
• Zα/2 = Critical value (z-score) for the chosen confidence level
• α = Significance level (1 − confidence level)
• σ = Standard deviation
• n = Sample size
• The value after the ± symbol is known as the margin of error.
• Note: This interval is exact only when the population distribution is
normal and σ is known. For large samples from other population
distributions, the interval is approximately correct by the Central Limit Theorem.
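A minimal Python sketch of the formula above, with the z value passed in directly (1.960 for a 95% level). The function names and the example figures (mean 50, σ = 10) are assumptions made for illustration only. The loop shows how the margin of error shrinks as the sample size grows.

```python
from math import sqrt


def margin_of_error(sigma, n, z=1.960):
    """The part after the ± sign: z * sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return z * sigma / sqrt(n)


def confidence_interval(mean, sigma, n, z=1.960):
    """Return (lower, upper) limits: mean ± z * sigma / sqrt(n)."""
    margin = margin_of_error(sigma, n, z)
    return mean - margin, mean + margin


# Hypothetical figures: sigma = 10, increasing sample sizes.
for n in (25, 100, 400):
    print(n, round(margin_of_error(10.0, n), 2))

print(confidence_interval(50.0, 10.0, 100))
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.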
Confidence Interval Table
Confidence Level   Z Value
80%                1.282
85%                1.440
90%                1.645
95%                1.960
99%                2.576
99.5%              2.807
99.9%              3.291
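The Z values in the table can be reproduced with Python's standard library instead of being looked up. In this sketch, `z_value` is a hypothetical helper name. It takes the two-sided critical value as the point of the standard normal distribution that leaves α/2 in the upper tail, where α = 1 − confidence level.

```python
from statistics import NormalDist


def z_value(confidence_level):
    """Two-sided critical value of the standard normal distribution."""
    alpha = 1 - confidence_level
    # Quantile with area alpha / 2 remaining in the upper tail.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.80, 0.85, 0.90, 0.95, 0.99, 0.995, 0.999):
    print(f"{level:.1%}  {z_value(level):.3f}")
```

Rounded to three decimals, the results match the table, and the same helper gives the Z value for any confidence level the table does not list.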
How to Calculate Confidence Interval?
• To calculate the confidence interval, go through the
following procedure.
• Step 1: Find the number of observations n (the sample
size), the mean X̄, and the standard deviation σ.
• Step 2: Choose a confidence level, most commonly
95% or 99%. Then find the Z value for that
confidence level in the table.
• Step 3: Finally, substitute all the values into the formula.
Confidence Interval Example
• Question: On a tree, there are hundreds of apples. You randomly choose 46 apples, which have
a mean of 86 and a standard deviation of 6.2. Estimate the mean for all the apples on the tree to judge whether they are big enough.
• Solution:
• Given: Mean, X̄ = 86
• Standard deviation, σ = 6.2
• Number of observations, n = 46
• Take the confidence level as 95%. Therefore, the value of z = 1.960 (from the table)
• The formula to find the confidence interval is
• X̄ ± Zα/2 × [ σ / √n ]
• Substituting the values into the formula, we get
• 86 ± 1.960 × [ 6.2 / √46 ]
• 86 ± 1.960 × [ 6.2 / 6.78 ]
• 86 ± 1.960 × 0.914
• 86 ± 1.79
• Here, the margin of error is 1.79
• Therefore, we can be 95% confident that the mean for all the hundreds of apples lies in the range of 84.21 to 87.79. This is an interval for the mean, not a range that every individual apple falls in.
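The arithmetic of this example can be checked step by step in Python. This is only a sketch that mirrors the substitution above, and the variable names are chosen for the illustration.

```python
from math import sqrt

# Values given in the example.
mean = 86       # sample mean
sigma = 6.2     # standard deviation
n = 46          # number of apples sampled
z = 1.960       # Z value for a 95% confidence level (from the table)

standard_error = sigma / sqrt(n)   # sigma / sqrt(n)
margin = z * standard_error        # margin of error
lower, upper = mean - margin, mean + margin

print(f"standard error  = {standard_error:.3f}")
print(f"margin of error = {margin:.2f}")
print(f"95% CI          = ({lower:.2f}, {upper:.2f})")
```

The printed values agree with the hand calculation: a standard error of about 0.914, a margin of error of 1.79, and an interval from 84.21 to 87.79.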
