Sri Sathya Sai University of Human Excellence, Kalaburgi (Karnataka)
6 Days Workshop on Basic Data Analytics for Data Analytics using Excel & PSPP (02/09/2021 - 07/09/2021)
Normal Distribution & Confidence Interval
Dr. Rohit Kanda

Frequency Distribution
• Marketing researchers often need to answer questions about a single variable. For example:
– How many users of the brand may be characterized as brand loyal?
– What percentage of the market consists of heavy users, medium users, light users, and non-users?
– How many customers are very familiar with a new product offering? How many are familiar, somewhat familiar, and unfamiliar with the brand? What is the mean familiarity rating? Is there much variance in the extent to which customers are familiar with the new product?
– What is the income distribution of brand users? Is this distribution skewed toward low-income brackets?
• The answers to these kinds of questions can be determined by examining frequency distributions. In a frequency distribution, one variable is considered at a time. The objective is to obtain a count of the number of responses associated with different values of the variable. The relative occurrence, or frequency, of different values of the variable is then expressed in percentages. A frequency distribution for a variable produces a table of frequency counts, percentages, and cumulative percentages, as in the following example.

Frequency Distribution of Familiarity
Value Label      Value   Frequency (n)   Percentage   Valid Percentage   Cumulative Percentage
V. Unfamiliar    1       0               0.0          0.0                0.0
                 2       2               6.7          6.9                6.9
                 3       6               20.0         20.7               27.6
                 4       6               20.0         20.7               48.3
                 5       3               10.0         10.3               58.6
                 6       8               26.7         27.6               86.2
V. Familiar      7       4               13.3         13.8               100.0
Missing          9       1               3.3
Total                    30              100.0        100.0
Note: Valid and cumulative percentages are based on the 29 non-missing responses.

[Figure: bar chart of frequency (0 to 9) by familiarity rating (1 to 7)]
[Figure: example curves for a normal distribution, a skewed distribution, and an abnormal distribution]

• Skewness: A characteristic of a distribution that assesses its symmetry about the mean.
• Kurtosis: A measure of the relative peakedness or flatness of the curve defined by the frequency distribution.
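Returning to the familiarity example, the frequency table can be reproduced with a few lines of Python. This is only a minimal sketch: the workshop itself uses Excel and PSPP, the `ratings` list is reconstructed from the counts in the table rather than taken from the raw survey data, and the name `frequency_table` is an illustrative assumption.

```python
from collections import Counter

# Ratings reconstructed from the table counts (an assumption, not raw data):
# 29 valid responses on a 1-7 scale plus one response coded 9 (missing).
MISSING_CODE = 9
ratings = [2] * 2 + [3] * 6 + [4] * 6 + [5] * 3 + [6] * 8 + [7] * 4 + [MISSING_CODE]


def frequency_table(values, scale, missing_code):
    """Return one row per scale value: count, %, valid %, cumulative valid %."""
    counts = Counter(values)
    total = len(values)                         # all responses, including missing
    valid_total = total - counts[missing_code]  # responses that are not missing
    rows = []
    cumulative = 0.0
    for value in scale:
        n = counts[value]
        valid_pct = 100 * n / valid_total
        cumulative += valid_pct
        rows.append({
            "value": value,
            "n": n,
            "pct": round(100 * n / total, 1),
            "valid_pct": round(valid_pct, 1),
            "cum_pct": round(cumulative, 1),
        })
    return rows


table = frequency_table(ratings, scale=range(1, 8), missing_code=MISSING_CODE)
for row in table:
    print(row)
```

The percentage column divides by all 30 responses, while the valid and cumulative percentages divide by the 29 non-missing responses, which is why 2 responses appear as 6.7% in one column and 6.9% in the other.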
Confidence Interval
• In statistics, a confidence interval is an interval estimate, computed from the observed data, that is likely to contain the true value of an unknown parameter. It is associated with a confidence level, which quantifies how confident we are that the interval contains that parameter. The interval presented here is based on the standard normal distribution, where the Z value is the z-score. Here, let us look at the definition, formula, table, and calculation of the confidence interval in detail.

Confidence Interval Definition
• The confidence level represents the proportion (long-run frequency) of confidence intervals that contain the true value of the unknown parameter. In other words, if confidence intervals were computed at the given confidence level from an endless number of independent samples, the proportion of those intervals containing the true value of the parameter would equal the confidence level.
• Usually, the confidence level is selected before examining the data. The most commonly used confidence level is 95%. However, other confidence levels, such as 90% and 99%, are also used.

Confidence Interval Formula
• The confidence interval is based on the mean and standard deviation. Thus, the formula to find the CI is
• X̄ ± Zα/2 × [ σ / √n ]
• Where
• X̄ = Sample mean
• Zα/2 = Confidence coefficient (the z-score for the chosen confidence level)
• α = Significance level (1 − confidence level)
• σ = Standard deviation
• n = Sample size
• The value after the ± symbol is known as the margin of error.
• Note: This interval is exact only when the population distribution is normal. For large samples from other population distributions, the interval is approximately correct by the Central Limit Theorem.

Confidence Interval Table
Confidence Level   Z Value
80%                1.282
85%                1.440
90%                1.645
95%                1.960
99%                2.576
99.5%              2.807
99.9%              3.291

How to Calculate Confidence Interval?
• To calculate the confidence interval, go through the following procedure.
• Step 1: Find the number of observations n (sample size), the mean X̄, and the standard deviation σ.
• Step 2: Decide on the confidence level of your choice, commonly 95% or 99%. Then find the Z value for that confidence level in the table.
• Step 3: Finally, substitute all the values in the formula.

Confidence Interval Example
• Question: A tree has hundreds of apples. You randomly choose 46 apples and find a mean size of 86 with a standard deviation of 6.2. Use a confidence interval for the mean to judge whether the apples are big enough.
• Solution:
• Given: Mean, X̄ = 86
• Standard deviation, σ = 6.2
• Number of observations, n = 46
• Take the confidence level as 95%. Therefore, the value of z = 1.960 (from the table)
• The formula to find the confidence interval is
• X̄ ± Zα/2 × [ σ / √n ]
• Now, substituting the values in the formula, we get
• 86 ± 1.960 × [ 6.2 / √46 ]
• 86 ± 1.960 × [ 6.2 / 6.78 ]
• 86 ± 1.960 × 0.914
• 86 ± 1.79
• Here, the margin of error is 1.79
• Therefore, with 95% confidence, the mean size of the apples on the tree lies between 84.21 and 87.79. The interval describes the mean, not the size of each individual apple.
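The worked example above can be checked with a short Python sketch. It assumes Python 3.8 or later (for `statistics.NormalDist`) and computes the Z value from the confidence level instead of reading it from the table. The function names `z_value` and `confidence_interval` are illustrative assumptions and are not part of the workshop material, which uses Excel and PSPP.

```python
from math import sqrt
from statistics import NormalDist


def z_value(confidence_level):
    """Two-sided z-score for a confidence level given as a fraction (e.g. 0.95)."""
    alpha = 1 - confidence_level  # significance level
    return NormalDist().inv_cdf(1 - alpha / 2)


def confidence_interval(mean, sd, n, confidence_level=0.95):
    """Return (lower, upper, margin) for mean ± z × sd / sqrt(n)."""
    margin = z_value(confidence_level) * sd / sqrt(n)
    return mean - margin, mean + margin, margin


# Values from the apple example: n = 46, mean = 86, standard deviation = 6.2.
lower, upper, margin = confidence_interval(mean=86, sd=6.2, n=46)
print(f"z = {z_value(0.95):.3f}")
print(f"margin of error = {margin:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

With these inputs the sketch prints a margin of error of 1.79 and an interval of (84.21, 87.79), matching the hand calculation. Passing `confidence_level=0.99` widens the interval because the Z value rises to 2.576.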