Statistics 3: DR Taher
Measures of central tendency
S If we compute the mean of a sample, we call it the sample mean,
and it is denoted by x̄ (read “x bar”).
Mean
Question
S N = 15
S x̄ = 15.2
Mean
S Advantages:-
S Easy to calculate
S It is unique in that a data set has one and only one mean.
S This is usually the best choice to describe data unless there is an
outlier.
S Disadvantages:-
S It is affected by outlying values
S Example:
2,5,6,8,11,14
Mean = 46/6 ≈ 7.67
If 14 is replaced by an outlying value of 100: Mean = 132/6 = 22
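A minimal Python sketch of how one outlying value pulls the mean. The six values listed sum to 46; a sum of 132 would arise if 14 were replaced by 100, which is assumed here for illustration. The function name `mean` is likewise only illustrative.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)

data = [2, 5, 6, 8, 11, 14]
with_outlier = [2, 5, 6, 8, 11, 100]  # assumption: 14 replaced by an outlier of 100

print(round(mean(data), 2))  # 7.67
print(mean(with_outlier))    # 22.0
```

A single extreme value moves the mean from about 7.67 to 22, although five of the six values are unchanged.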
Median
S Median:- the median is the value that divides a dataset into
two equal parts so that the number of values that are greater
than or equal to the median is equal to the number of values
that are less than or equal to the median.
S Median is what divides the data in the distribution into two
equal parts.
S Fifty percent (50%) of the values lie below the median and 50% lie
above it.
Median
S if X = [1,2,4,6,9,10,12,14,17]
then 9 is the median; i.e., the middle (5th) of the 9 values
S if X = [1,2,4,6,9,10,11,12,14,17]
then 9.5 is the median; i.e., (9+10)/2
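The two cases (odd and even number of values) can be sketched in Python as follows. This is one straightforward way to write it, not the only one; the function name `median` is illustrative.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle pair

print(median([1, 2, 4, 6, 9, 10, 12, 14, 17]))      # 9
print(median([1, 2, 4, 6, 9, 10, 11, 12, 14, 17]))  # 9.5
```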
Mode
S if X = [1,2,4,7,7,7,8,10,12,14,17]
then 7 is the mode; it appears three times, more than any other value
S This is usually the best choice to describe data if you want to
select the most popular value or item.
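Finding the most frequent value can be sketched in Python with a frequency count. The function name `modes` is illustrative; it returns a list because several values can tie for the highest count.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs the maximum number of times."""
    counts = Counter(values)
    top = max(counts.values())
    return [v for v, c in counts.items() if c == top]

print(modes([1, 2, 4, 7, 7, 7, 8, 10, 12, 14, 17]))  # [7]
```

If every value occurs equally often, the function returns all of them, which signals that the mode is not informative for that data set.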
Mode
S When all values appear the same number of times the idea
of a mode is not useful. But we could group them to see if
one group has more than the others.
Mode
S Example: {4, 7, 11, 16, 20, 22, 25, 26, 33}
S In groups of 10, the "20s" appear most often, so we could choose 25 as
the mode.
S You could use different groupings and get a different answer!
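The grouping idea can be sketched in Python as below. The function name `modal_group` and the alternative group width of 15 are assumptions chosen only to show that the answer depends on the grouping.

```python
from collections import Counter

def modal_group(values, width=10):
    """Count values per group of the given width; return (group start, count) of the fullest group."""
    counts = Counter((v // width) * width for v in values)
    return max(counts.items(), key=lambda item: item[1])

data = [4, 7, 11, 16, 20, 22, 25, 26, 33]
print(modal_group(data, width=10))  # (20, 4): the "20s" hold four values
print(modal_group(data, width=15))  # (15, 5): the group 15 to 29 holds five values
```

With groups of 10 the fullest group starts at 20; with groups of 15 it starts at 15, so a representative value chosen from the group would differ.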
Quartile
S The first quartile Q1 is the value below which the bottom 25 percent
of the data lie.
S The second quartile Q2 (the median) divides the data in the middle
and has 50 percent of the data below it.
S The third quartile Q3 has 75 percent of the data below it and the top
25 percent of the data above it.
Quartile
S Example: 5, 8, 4, 4, 6, 3, 8
S Sorted: 3, 4, 4, 5, 6, 8, 8
S Q1 = 4, Q2 = 5, Q3 = 8 (taking the median of each half)
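A minimal Python sketch of the quartiles for this example. It assumes the median-of-halves method (the overall median is left out of both halves when the count is odd); other quartile conventions exist and can give different values, especially for Q3. The function names are illustrative, and the sketch expects at least two values.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, skip the middle value so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)

print(quartiles([5, 8, 4, 4, 6, 3, 8]))  # (4, 5, 8)
```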
Measures of variation (Dispersion)
S Dispersion :
S Example
S [0,25,50,75,100]
S Example:
S The range ignores how data are distributed and only takes
the extreme scores into account
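The limitation of the range can be sketched in Python. The range is taken here as the largest value minus the smallest; the first list is the example data above, while the second list is hypothetical data invented only to share the same extremes.

```python
def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

evenly_spread = [0, 25, 50, 75, 100]
clustered = [0, 50, 50, 50, 100]  # hypothetical: same extremes, different spread

print(value_range(evenly_spread))  # 100
print(value_range(clustered))      # 100
```

Both lists give a range of 100 even though the values in between are distributed very differently.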
Dispersion: Interquartile Range
Characteristics of a Normal Distribution
S The height of the curve decreases as one moves away from
the mean in either direction, approaching, but never
reaching zero
S About 68.3% of the area under a normal curve is within one
standard deviation (SD) of the mean
S The mean, median, and mode all coincide
S The distance of a value x from the mean of the curve, in units of
standard deviation, is called the relative deviate or standard normal
variate (the “Z” score)
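A short Python sketch of the Z score and of the 68.3% figure. It assumes the usual formula z = (x - mean) / SD; the numbers 130, 100 and 15 are hypothetical values used only for illustration. The area within k standard deviations is computed with the error function from the standard `math` module.

```python
import math

def z_score(x, mean, sd):
    """Distance of x from the mean, measured in standard deviations."""
    return (x - mean) / sd

def area_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return math.erf(k / math.sqrt(2))

print(z_score(130, mean=100, sd=15))   # 2.0 (hypothetical values)
print(round(area_within(1) * 100, 1))  # 68.3
```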
Symmetry: Skew
S One side is more spread out than the other, like a tail
Skewed left
Skewed right