Data Management
(STATISTICS)
Introduction
B. TABULAR PRESENTATION
A. The Mean
➔ Used when the scores are close to each other and when values between
or among scores follow the same pattern
➔ Obtained by getting the sum of all values divided by the number of cases
Formula:
$$\bar{X} = \frac{\sum X}{N}$$
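The mean described above (the sum of all values divided by the number of cases) can be sketched in Python. The grouped-data version is an assumption based on the usual convention of weighting each class midpoint by its frequency; the function names and the sample numbers are made up for illustration.

```python
def mean_ungrouped(scores):
    """Sum of all values divided by the number of cases."""
    return sum(scores) / len(scores)


def mean_grouped(midpoints, frequencies):
    """Assumed grouped-data form: sum of f * X over N, with X the class midpoint."""
    n = sum(frequencies)
    return sum(f * x for x, f in zip(midpoints, frequencies)) / n


scores = [80, 82, 85, 87, 86]                   # made-up scores
print(mean_ungrouped(scores))                   # 84.0
print(mean_grouped([12, 17, 22], [2, 5, 3]))    # 17.5
```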
B. The Median
➔ a score point which divides a ranked distribution into two equal
parts
➔ it is the value below which lies 50% of the data.
➔ Appropriate when there are values which are relatively large or
relatively small compared to most of the scores
➔ Appropriate when open-ended intervals (for grouped scores) are
involved
➔ Associated with ordinal data
➔ Best used when a distribution is positively or negatively skewed
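A short Python sketch of the median for ungrouped scores: rank the scores, then take the middle one (or the average of the two middle ones when the count is even). The data are made up, with one deliberately extreme score to show why the median suits skewed distributions.

```python
def median(scores):
    """Middle value of the ranked scores; half of the data lies below it."""
    ranked = sorted(scores)
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]
    # Even number of cases: average the two middle scores.
    return (ranked[mid - 1] + ranked[mid]) / 2


scores = [12, 15, 14, 13, 95]        # 95 is relatively large compared to the rest
print(median(scores))                # 14
print(sum(scores) / len(scores))     # 29.8 (the mean is pulled up by the extreme score)
```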
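C. The Mode
Formula (grouped data):
$$\text{Mode} = l.l. + \left( \frac{f - f_1}{(f - f_1) + (f - f_2)} \right) i$$
where:
➔ l.l. = lower real limit of the modal class
➔ f = frequency of the modal class
➔ f1 = frequency of the class preceding the modal class
➔ f2 = frequency of the class following the modal class
➔ i = interval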
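The symbols listed above can be combined in a small Python function. This is a sketch that assumes the common interpolation formula for the mode of grouped data; the function name and the class values are hypothetical.

```python
def grouped_mode(lower_limit, f, f1, f2, interval):
    """Estimate the mode of grouped data by interpolating inside the modal class.

    lower_limit: lower real limit of the modal class (l.l.)
    f:  frequency of the modal class
    f1: frequency of the class preceding the modal class
    f2: frequency of the class following the modal class
    interval: class interval (i)
    """
    d1 = f - f1
    d2 = f - f2
    return lower_limit + (d1 / (d1 + d2)) * interval


# Hypothetical modal class 20-24 (lower real limit 19.5), interval of 5
estimate = grouped_mode(lower_limit=19.5, f=10, f1=6, f2=8, interval=5)
print(round(estimate, 2))   # 22.83
```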
Measures of Position
Fractiles
The Quartiles
➔ these are score points which divide a distribution into 4 equal parts
Where:
= quartile level
Fup = cumulative frequency up to (but not including) the class containing n/4
F = frequency of the class containing n/4
i = interval
l.l. = lower real limit of the class containing n/4
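Assuming the usual interpolation formula for grouped data, and reading Fup as the cumulative frequency below the class containing n/4, the symbols above combine as follows for the first quartile. The numbers in the worked line are made up for illustration.

```latex
Q_1 = l.l. + \left( \frac{\frac{n}{4} - F_{up}}{F} \right) i

% Hypothetical example: n = 40, l.l. = 14.5, F_{up} = 8, F = 10, i = 5
Q_1 = 14.5 + \left( \frac{10 - 8}{10} \right)(5) = 15.5
```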
The Deciles
The Percentiles
Where:
= percentile level
Fup = cumulative frequency up to (but not including) the class containing n/100
F = frequency of the class containing n/100
i = interval
l.l. = lower real limit of the class containing n/100
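Quartiles, deciles, and percentiles differ only in the number of equal parts (4, 10, or 100), so one function can cover all three. The sketch below assumes the standard interpolation formula for grouped data and treats Fup as the cumulative frequency below the class that contains the target position; the function name and the frequency table are hypothetical.

```python
def grouped_fractile(lower_limits, frequencies, interval, k, parts):
    """Estimate the k-th fractile of grouped data.

    lower_limits: lower real limit of each class, in ascending order
    frequencies:  frequency of each class
    interval:     class interval (i)
    k, parts:     e.g. k=1, parts=4 for Q1; k=50, parts=100 for P50
    """
    n = sum(frequencies)
    target = k * n / parts      # position of the fractile, e.g. n/4
    cumulative = 0              # Fup: cumulative frequency below the current class
    for lower_limit, f in zip(lower_limits, frequencies):
        if f > 0 and cumulative + f >= target:
            return lower_limit + ((target - cumulative) / f) * interval
        cumulative += f
    raise ValueError("k must lie between 0 and parts")


# Hypothetical frequency table: classes 10-14, 15-19, 20-24, 25-29 (n = 40)
limits = [9.5, 14.5, 19.5, 24.5]
freqs = [8, 10, 16, 6]
print(grouped_fractile(limits, freqs, 5, k=1, parts=4))      # 15.5   (first quartile)
print(grouped_fractile(limits, freqs, 5, k=50, parts=100))   # 20.125 (50th percentile)
```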
MEASURES OF DISPERSION / VARIABILITY
A. RANGE
➔ difference between the highest value and the lowest value in the data
➔ for grouped data, the range is estimated by subtracting the lower real
limit of the lowest class interval from the upper real limit of the
highest class interval
➔ describes how far the highest value is from the lowest value but does
not tell anything about the scores between the two extreme values
➔ easily determined.
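A brief Python illustration of the range for ungrouped and grouped data. The two made-up score sets share the same range even though their middle scores differ, which is the limitation noted above. Function names are hypothetical.

```python
def range_ungrouped(scores):
    """Highest value minus lowest value."""
    return max(scores) - min(scores)


def range_grouped(lowest_lower_real_limit, highest_upper_real_limit):
    """Upper real limit of the highest class minus lower real limit of the lowest class."""
    return highest_upper_real_limit - lowest_lower_real_limit


clustered = [10, 50, 50, 50, 90]    # middle scores bunched together
spread = [10, 11, 12, 89, 90]       # middle scores pushed to the extremes
print(range_ungrouped(clustered))   # 80
print(range_ungrouped(spread))      # 80
print(range_grouped(9.5, 29.5))     # 20.0
```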
B. INTERQUARTILE RANGE (IR) AND QUARTILE DEVIATION (QD)
➔ these measures are generally more desirable than the range when the
distribution is truncated or skewed or when the median is the only
measure of central tendency that is available
➔ the IR indicates the distance between the two values which determine
the middle 50% of all the observations within the distribution.
Formulas:
$$IR = Q_3 - Q_1$$
$$QD = \frac{Q_3 - Q_1}{2}$$
where:
Q3 – 3rd quartile
Q1 – 1st quartile
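As a quick illustration, taking the IR as Q3 minus Q1 and the QD as half of the IR (the usual definitions), the two measures can be computed from any pair of quartiles. The quartile values below are hypothetical.

```python
def interquartile_range(q1, q3):
    """Distance between the quartiles that bound the middle 50% of the data."""
    return q3 - q1


def quartile_deviation(q1, q3):
    """Half of the interquartile range."""
    return (q3 - q1) / 2


q1, q3 = 15.5, 24.5                   # hypothetical first and third quartiles
print(interquartile_range(q1, q3))    # 9.0
print(quartile_deviation(q1, q3))     # 4.5
```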
C. MEAN DEVIATION or AVERAGE DEVIATION
For ungrouped data:
$$MD = \frac{\sum |X - \bar{X}|}{N}$$
where:
➔ X – individual score or the midpoint
➔ $\bar{X}$ – mean
➔ N – number of cases
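A minimal Python sketch of the mean deviation for ungrouped data, assuming the usual definition: the average of the absolute differences between each score and the mean. The scores are made up.

```python
def mean_deviation(scores):
    """Average absolute distance of the scores from their mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sum(abs(x - mean) for x in scores) / n


scores = [2, 4, 6, 8, 10]        # mean is 6; absolute deviations are 4, 2, 0, 2, 4
print(mean_deviation(scores))    # 2.4
```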
D. VARIANCE AND STANDARD DEVIATIONS
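Population variance:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
Population standard deviation:
$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$
where:
➔ X – individual score
➔ μ – population mean
➔ N – population size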
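The population variance and standard deviation can be sketched in Python as follows, assuming the usual definitions: the variance is the average squared deviation from μ, and the standard deviation is its square root. The scores are made up, and the function names are hypothetical.

```python
import math


def population_variance(scores):
    """Average squared deviation of the scores from the population mean."""
    n = len(scores)
    mu = sum(scores) / n
    return sum((x - mu) ** 2 for x in scores) / n


def population_sd(scores):
    """Square root of the population variance."""
    return math.sqrt(population_variance(scores))


scores = [2, 4, 4, 4, 5, 5, 7, 9]     # population mean is 5
print(population_variance(scores))    # 4.0
print(population_sd(scores))          # 2.0
```

Python's standard library offers the same calculations as `statistics.pvariance` and `statistics.pstdev`.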
COEFFICIENT OF VARIATION
➔ useful when the purpose is to reflect how large the variation is relative to the
average
➔ a lesser CV means that the set of data is relatively less scattered
about the mean than a distribution with a higher CV
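A small Python sketch of the comparison described above, assuming the usual definition of the CV as the standard deviation divided by the mean, expressed as a percentage. The two data sets are hypothetical summaries, not real data.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean (assumed definition)."""
    return 100 * sd / mean


cv_a = coefficient_of_variation(sd=5, mean=50)      # set A
cv_b = coefficient_of_variation(sd=10, mean=200)    # set B
print(cv_a)   # 10.0
print(cv_b)   # 5.0
```

Set B has the larger standard deviation, yet its lower CV shows that its values are relatively less scattered about the mean than those of set A.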