
Standard Deviation

Standard deviation (SD) is a measure of the dispersion of data points relative to their mean, indicating how spread out values are in a dataset. It is calculated as the square root of variance and is crucial for understanding data variability, with lower SD indicating values close to the mean and higher SD indicating values further away. The document also explains the calculation of SD, variance, and provides examples, emphasizing the importance of these statistical measures in data analysis.

Uploaded by

elysha08

6. Data Analysis and Interpretation
Standard deviation
What is Standard Deviation?
• In descriptive statistics, standard deviation is the degree of
dispersion, or scatter, of the data points relative to their mean.
• It tells how the values are spread across the data
sample and it is the measure of the variation of the data
points from the mean.
• The standard deviation of a data set, sample, statistical
population, random variable, or probability distribution
is the square root of its variance.
Standard Deviation
• is the positive square root of the variance. It is one of
the basic methods of statistical analysis.
• is commonly abbreviated as SD and denoted by the
symbol 'σ'; it tells how much the data values
deviate from the mean value.
• If we get a low standard deviation, then it means
that the values tend to be close to the mean
whereas a high standard deviation tells us that the
values are far from the mean value.
Remember…

• The standard deviation is a measure of how
spread out numbers are from the mean
(average) of a data set. It gives you an idea
of how much the values in the data set
differ from the average value.
Exercise
• The following are the History test scores for 6 Bistari.
Find the Mode, Median, Mean, and Standard Deviation
of these scores.
55 60 70 75 65 70
80 65 80 70 75 65
60 55 85 85 60 80
70 45 50 65 80 85
• STEP 1: Arrange the scores in ascending order.
• STEP 2: Prepare a Frequency Table. (The frequency
distribution table drawn below is called an ungrouped
frequency distribution table.)
Marks obtained in the test | Number of students (Frequency)
• STEP 2: Prepare a Frequency Table (Scores/Data Not
Collated)
• Class interval size = Range / Number of classes, where Range = largest value - smallest value
Class Interval | Class Boundaries | Tally Marks | Frequency (f) | Cumulative Frequency

Total Frequency Σf = 24
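As a way to check the exercise by machine, here is a minimal Python sketch using only the standard `statistics` module. It assumes the 24 pupils are treated as the whole population (so the SD divides by n); that choice is an assumption, not something the exercise states.

```python
from statistics import mean, median, multimode, pstdev

# History test scores for 6 Bistari, copied from the exercise.
scores = [
    55, 60, 70, 75, 65, 70,
    80, 65, 80, 70, 75, 65,
    60, 55, 85, 85, 60, 80,
    70, 45, 50, 65, 80, 85,
]

ordered = sorted(scores)            # STEP 1: ascending order
modes = sorted(multimode(scores))   # every score tied for the highest frequency
mid = median(scores)
avg = mean(scores)
# Assumption: the class is the whole population, so divide by n.
sd = pstdev(scores)

print(ordered)
print("modes:", modes, "median:", mid, "mean:", avg, "SD:", round(sd, 2))
```

The data set has three modes (65, 70 and 80 each occur four times), which is why `multimode` is used rather than `mode`. Swapping `pstdev` for `stdev` would give the sample SD instead.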
Standard Deviation
• defines the spread of data values around the mean.
• Here are two standard deviation formulas that are used
to find the standard deviation of sample data and the
standard deviation of the given population.
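For reference, the two formulas can be written out in standard notation as follows. The symbols are the usual textbook ones and match those used elsewhere in these slides (σ and μ for a population of N values, s and x̄ for a sample of n values).

```latex
% Population standard deviation (N values, population mean \mu)
\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}

% Sample standard deviation (n values, sample mean \bar{x})
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
```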
• Example 1: There are 39 plants in the garden. A few
plants were selected randomly and their heights in cm
were recorded as follows: 51, 38, 79, 46, 57. Calculate
the standard deviation of their heights.
• Solution:
• n = 5
• Sample mean x̄ = (51 + 38 + 79 + 46 + 57)/5 = 271/5 = 54.2
• Since sample data is given, we use the sample SD
formula:
s = √[Σ(xi − x̄)²/(n − 1)] = √(962.8/4) = √240.7 ≈ 15.5
Answer: Standard deviation for this data is 15.5
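Example 1 can be checked with Python's standard library. This is only a sketch; `heights_cm` is a name chosen here for illustration.

```python
from statistics import mean, stdev

heights_cm = [51, 38, 79, 46, 57]   # the five randomly selected plants

x_bar = mean(heights_cm)   # sample mean
s = stdev(heights_cm)      # sample SD: divides by n - 1

print(x_bar, round(s, 1))
```

`stdev` is the sample version; `pstdev` would divide by n and is only appropriate if all 39 plants had been measured.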
Example 2:
• In a class of 50, 4 students were selected at random
and their total marks in the final assessments are
recorded, which are: 812, 836, 982, and 769. Find the
variance and standard deviation of their marks.
Solution:

• n = 4
• Sample mean (x̄) = (812 + 836 + 982 + 769)/4 = 849.75
• Calculate the sample standard deviation, as the given
data is just a sample.
Sample variance s² = [(812 − 849.75)² + (836 − 849.75)² + (982 − 849.75)² + (769 − 849.75)²]/3
= 8541.58

Using the SD formula,

SD = √8541.58 = 92.4

Answer: Variance is 8541.58 and standard deviation for this data is 92.4
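The same calculation for Example 2 can be spelled out step by step without any library helper. This is a sketch, with variable names chosen here for illustration.

```python
from math import sqrt

marks = [812, 836, 982, 769]   # total marks of the four sampled students

n = len(marks)
x_bar = sum(marks) / n                            # sample mean
squared_devs = [(x - x_bar) ** 2 for x in marks]  # (x - x_bar)^2 for each mark
variance = sum(squared_devs) / (n - 1)            # sample variance: divide by n - 1
sd = sqrt(variance)                               # sample standard deviation

print(x_bar, round(variance, 2), round(sd, 1))
```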
How to calculate Standard Deviation, Mean, Variance Statistics

Score (x) | Deviation about the mean (x − x̄) | (x − x̄)²
85 | |
77 | |
72 | |
60 | |
50 | |
40 | |
∑ = | |

Mean x̄ = ?
Answer
Score (x) | Deviation about the mean (x − x̄) | (x − x̄)²
85 | 21 | 441
77 | 13 | 169
72 | 8 | 64
60 | -4 | 16
50 | -14 | 196
40 | -24 | 576
∑ = 384 | 0 | ∑ = 1462

Mean x̄ = 384/6 = 64
Variance (s²) = 1462/(6 − 1) = 292.4
Standard deviation (s) = √292.4 = 17.09971
Variance is the average of the squared deviations from the mean, while
standard deviation is the square root of this number.

Both measures reflect variability in a distribution, but their units differ:

Standard deviation is expressed in the same units as the original values
(e.g., minutes or meters).
Variance is expressed in squared units (e.g., minutes² or meters²).
Sample standard deviation formula: s = √[ Σ (xi – x̅)² / (n − 1) ]

Sample variance formula: s² = Σ (xi – x̅)² / (n − 1)


• The SD is usually more useful to describe the variability of the data,
while the variance is usually much more useful mathematically.

• For example, the sum of uncorrelated distributions (random variables)
also has a variance that is the sum of the variances of those
distributions.
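The additivity of variance can be illustrated with a tiny hand-built data set. The two lists below are made up for illustration and are chosen so that their covariance is exactly zero; this is a numerical sketch, not a proof.

```python
from statistics import mean, pvariance

# Two illustrative variables with zero covariance (uncorrelated).
x = [-1, 1, -1, 1]
y = [-1, -1, 1, 1]

mx, my = mean(x), mean(y)
# Population covariance: mean of the products of the deviations.
cov = mean((a - mx) * (b - my) for a, b in zip(x, y))

total = [a + b for a, b in zip(x, y)]   # element-wise sum x + y

print("cov:", cov)
print("var(x) + var(y):", pvariance(x) + pvariance(y))
print("var(x + y):", pvariance(total))
```

The standard deviations do not add in the same way: here each SD is 1, but the SD of the sum is √2, not 2.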
• Imagine every student in a class takes an English test. After grading the
tests, the teacher wants to understand how the students
performed, not just who got the highest or lowest score, but how
all the scores are spread out.
• Standard Deviation (SD) helps here by showing how much the English
scores vary from the average score. If SD is small, it means most
students scored around the same mark as the average; maybe the test
was fair and everyone was well prepared.
• If SD is large, students' scores are all over the place; perhaps the test
was really hard, or maybe not everyone studied the same way.
• Variance is like SD's less straightforward cousin. It also tells us about
how spread out the English scores are, but it does so by first making
the differences from the average bigger (by squaring them). This
makes the variance number hard to understand directly (because it's not
in "English test scores" anymore, but in "squared English test scores"),
but it's very useful for some types of math problems, especially
when comparing the spread of scores between different classes or
tests.
• For example, if we have two classes and we want to know which class has
more variation in English scores, variance can help make this clear,
especially if we're mixing data or looking at it in a complex way.

• But to make it easy to understand, we usually go back to talking about SD
since it's in terms we all get – like points on an English test.
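Because deviations are measured from the sample mean rather than the true
population mean, a sample tends to underestimate the spread of the
population. To adjust for this, the denominator of the sample variance (and
hence of the sample standard deviation) is corrected to be n − 1 instead of
just n. This is known as Bessel's correction.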
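The effect of Bessel's correction can be seen by computing both versions on the six scores from the worked table (85, 77, 72, 60, 50, 40). The helper `variance` and its `ddof` parameter are hypothetical names introduced for this sketch.

```python
def variance(data, ddof=0):
    """Mean squared deviation from the mean.

    ddof=0 divides by n (population variance);
    ddof=1 divides by n - 1 (Bessel's correction, sample variance).
    """
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - ddof)


scores = [85, 77, 72, 60, 50, 40]

pop_var = variance(scores)             # 1462 / 6
sample_var = variance(scores, ddof=1)  # 1462 / 5

print(round(pop_var, 2), round(sample_var, 2))
```

The corrected value is always the larger of the two, and the gap shrinks as n grows.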
Steps to Calculate Standard Deviation
by Actual Mean Method:

1. Collect Your Data: List all the values in your data set.
2. Calculate the Mean:
   1. Add all the data values together.
   2. Divide the sum by the total number of data points (values). This result is your mean.
3. Find the Deviations from the Mean:
   1. Subtract the mean from each data value to find the deviation of each value from the mean.
4. Square Each Deviation:
   1. Square each of the deviations found in the previous step. Squaring is done to eliminate
      negative values and give more weight to larger deviations.
5. Calculate the Mean of the Squared Deviations:
   1. Add all the squared deviations together.
   2. Divide this sum by the total number of data points. This gives you the (population) variance.
      For a sample, divide by n − 1 instead.
6. Take the Square Root of the Variance:
   1. The final step is to take the square root of the variance. This gives you the standard deviation.
Simple exercise
• Imagine you have data on the number of books 5
students read in a month: 4, 6, 8, 10, and 12 books.
1. Calculate the Mean:
   (4 + 6 + 8 + 10 + 12) / 5 = 40 / 5 = 8 books
2. Find the Deviations from the Mean (x – x̄):
   4 - 8 = -4
   6 - 8 = -2
   8 - 8 = 0
   10 - 8 = 2
   12 - 8 = 4
3. Square Each Deviation (x – x̄)²:
   (-4)^2 = 16
   (-2)^2 = 4
   0^2 = 0
   2^2 = 4
   4^2 = 16
4. Calculate the Mean of the Squared Deviations:
   (16 + 4 + 0 + 4 + 16) / 5 = 40 / 5 = 8
5. Take the Square Root of the Variance:
   √8 ≈ 2.83
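• So, the standard deviation of this data set is
approximately 2.83, indicating that the number of books
read by the students typically deviates from the mean by
about 2.83 books. (Dividing by 5 treats the five students as
the whole population; dividing by n − 1 = 4 would give the
sample SD, √10 ≈ 3.16.)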
• This method gives a clear picture of the variability or
dispersion of a data set, helping in understanding the
spread of data points around the mean.
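The steps of the actual mean method translate almost line for line into code. The sketch below applies them to the book counts from the simple exercise and, like the exercise, divides by n; the function name `sd_actual_mean` is chosen here for illustration.

```python
from math import sqrt


def sd_actual_mean(values):
    """Population standard deviation by the actual mean method."""
    n = len(values)
    mean = sum(values) / n                    # step 2: mean
    deviations = [x - mean for x in values]   # step 3: deviations from the mean
    squared = [d ** 2 for d in deviations]    # step 4: squared deviations
    variance = sum(squared) / n               # step 5: mean of squared deviations
    return sqrt(variance)                     # step 6: square root of the variance


books = [4, 6, 8, 10, 12]   # step 1: the collected data
sd = sd_actual_mean(books)
print(round(sd, 2))  # 2.83
```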
Standard Deviation of Discrete Data by Actual Mean
Method
Score (x) | Frequency (f) | Deviation about the mean (x − x̄) | (x − x̄)²
40 | 2 | |
50 | 4 | |
60 | 5 | |
70 | 7 | |
80 | 4 | |
90 | 3 | |
N = 25

Score (x) | Frequency (f) | Deviation about the mean (x − x̄) | (x − x̄)²
40 | 2 | -26.4 | 696.96
50 | 4 | -16.4 | 268.96
60 | 5 | -6.4 | 40.96
70 | 7 | 3.6 | 12.96
80 | 4 | 13.6 | 184.96
90 | 3 | 23.6 | 556.96
Mean x̄ = 66.4, N = 25
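• Standard Deviation by The Actual Mean Method
• In this method, first compute the mean of the data
values (x̄) and then compute the deviations of each data
value from the mean. Then use the following standard
deviation formula by actual mean method:
• σ = √(∑(x − x̄)²/n), where n = total number of
observations. For a frequency table this becomes
σ = √(∑f(x − x̄)²/N), where N = ∑f.
• Mean of these data: x̄ = ∑fx/N = 1660/25 = 66.4
• The sum of the squared deviations listed in the table is 1761.76,
but this counts each distinct score only once. Because the scores
occur with different frequencies, each squared deviation must be
multiplied by its frequency f before summing: ∑f(x − x̄)² = 5176
• Variance = ∑f(x − x̄)²/N = 5176/25 = 207.04
• Standard deviation = √207.04 ≈ 14.39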
In the notation f(x – x̄)², f is the frequency of the score x (the
number of times that score occurs), not a function, where:
x is a variable representing individual values or data
points within a data set.
x̄ (pronounced "x bar") is the mean (average) of all the
values in the data set.
(x − x̄)² is the squared difference between an individual
value and the mean of the data set.
• Squaring this difference is a common step in statistical
calculations, especially when calculating variance and
standard deviation, as it ensures that all differences are
positive and emphasizes larger differences.
• f(x – x̄)² is therefore the squared deviation of a score
multiplied by the number of times that score occurs, so that
every observation is counted.
• For example, in the context of calculating the variance
of a data set, you would sum all (x – x̄)² values (or all
f(x – x̄)² values, for a frequency table) and then
divide by the number of values (or n − 1, in the case of
sample variance) to find the average squared deviation,
which is the variance.
• In symbols, the variance is

σ² = ∑(x – x̄)² / N for a population, or s² = ∑(x – x̄)² / (n − 1) for a sample,

with N (or n) being the total number of data points.


• Data Point: A student scored 92% on a mathematics exam.
• Context: This score is one data point within a larger dataset of
grades used to evaluate the student's academic performance.
Score (x) | Frequency (f) | Deviation about the mean (x − x̄) | (x − x̄)² | f(x − x̄)²
40 | 2 | -26.4 | 696.96 | 1393.92
50 | 4 | -16.4 | 268.96 | 1075.84
60 | 5 | -6.4 | 40.96 | 204.8
70 | 7 | 3.6 | 12.96 | 90.72
80 | 4 | 13.6 | 184.96 | 739.84
90 | 3 | 23.6 | 556.96 | 1670.88
N = 25, ∑f(x − x̄)² = 5176

Mean x̄ = 66.4

σ = √(5176/25)

= √207.04

σ ≈ 14.39
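A frequency table like this one is easy to get wrong by hand, so it can help to recompute the weighted sum in code. The sketch below stores the table as a dictionary mapping score to frequency (a representation chosen here for illustration) and uses the population formula with N = Σf.

```python
from math import sqrt

# Score -> frequency, copied from the table.
freq = {40: 2, 50: 4, 60: 5, 70: 7, 80: 4, 90: 3}

N = sum(freq.values())                                   # total frequency
mean = sum(x * f for x, f in freq.items()) / N           # sum(f*x) / N
weighted_ss = sum(f * (x - mean) ** 2 for x, f in freq.items())  # sum(f*(x - mean)^2)
variance = weighted_ss / N
sigma = sqrt(variance)

print(N, round(mean, 1), round(weighted_ss, 2), round(variance, 2), round(sigma, 2))
```

Each row contributes f times its squared deviation; for the score 70, that is 7 × 12.96 = 90.72.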
Standardized Score: T-score

• A T-score is a standardized score that re-expresses how many
standard deviations a particular data point is from the mean of a
dataset, on a scale with a mean of 50 and a standard deviation of 10.
• It's often used when comparing an individual's score to a larger
group or population.
• T-scores are especially common in educational testing and
psychological assessments.
• The T-score formula is:
T = 50 + 10z, with z = (X - μ) / σ,
where X is the individual score, μ is the mean of the
population, and σ is the standard deviation of the population.
• This T-score should not be confused with the t-statistic,
t = (x̄ - μ) / (s / √n), which tests a sample mean and depends on
the sample size n.
Standardized Score: Z-score

• A Z-score is also a standardized score that measures how
many standard deviations a data point is from the mean of a
dataset.
• It's used to compare individual data points to a standard
normal distribution (mean of 0 and standard deviation of 1).
• Z-scores are commonly used in statistics and research to
analyze and compare data.
• The Z-score formula is:
Z = (X - μ) / σ,
where X is the individual score, μ is the mean of the population,
and σ is the standard deviation of the population.
Why are Z-Scores Important?

It is useful to standardize the values (raw scores) of
a normal distribution by converting them into z-scores
because:
• It allows researchers to calculate the probability of a
score occurring within a standard normal distribution;
• It enables us to compare two scores from different
samples (which may have different means and standard
deviations).
The formula for calculating a z-score

z = (x-μ)/σ

where x is the raw score, μ is the population mean, and σ is the
population standard deviation.
As the formula shows, the z-score is simply the raw score minus the
population mean, divided by the population standard deviation.
Example of calculation of Z-score and T-score:

The distribution of scores for five students in a test is 5, 8, 10, 4, and 3. Find the Z-score
and T-score for a student who scored 10 marks.
Score (X) | (x – x̄) | (x – x̄)²
5 | 5 - 6 = -1 | 1
8 | 8 - 6 = 2 | 4
10 | 10 - 6 = 4 | 16
4 | 4 - 6 = -2 | 4
3 | 3 - 6 = -3 | 9
N = 5 | | ∑ = 34

Mean x̄ = 6
Z-score for the pupil who scored 10 marks:
s = √(∑(x − x̄)²/(n − 1)) = √(34/(5 − 1)) = √8.5 ≈ 2.92
z = (x – x̄)/s = (10 - 6)/2.92 ≈ 1.37
T-score for the pupil who scored 10 marks:
T = 50 + 10z
= 50 + 10(1.37)
= 63.7
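The worked example can be reproduced in a few lines. This sketch follows the example in using the sample standard deviation (n − 1 in the denominator) and the T = 50 + 10z convention.

```python
from statistics import mean, stdev

scores = [5, 8, 10, 4, 3]   # the five test scores
x = 10                      # the pupil of interest

x_bar = mean(scores)   # 6
s = stdev(scores)      # sample SD: sqrt(34 / 4)

z = (x - x_bar) / s    # distance from the mean in SD units
t = 50 + 10 * z        # rescaled to mean 50, SD 10

print(round(z, 2), round(t, 1))  # 1.37 63.7
```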
• Z-scores and T-scores are used in assessment to indicate
how far a score deviates from the mean.
• If your exam score has a z-score of 1.79, it means your
score lies 1.79 standard deviations above the mean.
• Both standard scores are very useful for comparing
achievements between subjects where the mean
and standard deviation differ significantly.
• Without converting to standard scores for various
subjects, the comparisons made may be less accurate.
Standardized score table for student "X"

Subject | Score (X) | x̄ | σ | Z-score | T-score
English | – | 70 | 5 | 2 | 70
Bahasa Melayu | 80 | 88 | 8 | -1 | 40
Maths | – | 56 | 5 | 2 | 70

Z-score for the pupil who scored 80 marks (Bahasa Melayu):
z = (x – x̄)/σ = (80 - 88)/8 = -1
T-score for the pupil who scored 80 marks:
T = 50 + 10z = 50 + 10(-1) = 40 (failed, if the passing mark is 50)
• Comparison of subject achievement scores for a student
can be made through standard deviations using z-scores
and T-scores.
• Based on the information from the table above, it can be
inferred that the student's performance in English
and Maths is equally good because their T-scores
are the same, i.e., 70.
• Although the student's raw score in Bahasa Melayu is
high (80), the standard score is low, i.e., 40, because the
raw score lies below the subject mean (88); with a standard
deviation of 8, it is one standard deviation below.
• The z-score for the Bahasa Melayu subject is -1,
indicating that the student's achievement is below
the mean.
• Therefore, if the passing mark is 50, even though the
student scored 80, they actually failed in Bahasa Melayu.
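The cross-subject comparison can be sketched as two small helper functions. The names `z_score` and `t_score` are chosen here for illustration; the Bahasa Melayu figures (score 80, mean 88, SD 8) and the z-scores of 2 for English and Maths are taken from the table.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def t_score(z):
    """Rescale a z-score to a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z


# Bahasa Melayu: computed from the raw score, mean and SD.
bm_z = z_score(80, 88, 8)

# English and Maths: the table lists their z-scores directly.
t_scores = {
    "English": t_score(2),
    "Bahasa Melayu": t_score(bm_z),
    "Maths": t_score(2),
}

for subject, t in t_scores.items():
    print(subject, t, "pass" if t >= 50 else "fail")
```

Using 50 as the pass threshold follows the slide's assumption; on the T scale that is the same as asking whether the score is at or above the subject mean.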
Summary
• Both T-scores and Z-scores are ways to standardize
data by expressing them in terms of standard
deviations from the mean, but they differ in the
context in which they are used and in the formulas used
to calculate them.
Interpretation
• The value of the z-score tells you how many standard
deviations a raw score is away from the mean. If a z-score is
equal to 0, the score is on the mean.

• A positive z-score indicates the raw score is higher than
the mean. For example, if a z-score is equal to +1,
it is 1 standard deviation above the mean.
• A negative z-score reveals the raw score is below the
mean. For example, if a z-score is equal to -2, it is
two standard deviations below the mean.
• Another way to interpret z-scores is by creating a standard
normal distribution, also known as the z-score distribution,
or probability distribution.
