Lesson 2 StatAna
Indeed, one of the most basic statistical concepts involves finding measures of central tendency of a set of numerical data. It is often helpful to find numerical values that locate, in some sense, the center of a set of data.

Measure of Central Tendency
− is the point about which the scores tend to cluster. It is the center of concentration of scores in any set of data.
− It is also a value which gives a summary of the characteristics of a given set of data.

THREE MEASURES OF CENTRAL TENDENCY
1. Mean
2. Median
3. Mode

THE MEAN
representative or typical value in a set of numerical data

Arithmetic Mean/Mean
− is the sum of all the given values or items in a distribution divided by the number of values or items summed.

TWO TYPES OF DATA
In its calculation, two types of data are involved.

Ungrouped Data
− refers to data not yet organized into a frequency distribution.

Grouped Data
− refers to a set of data presented in the form of a frequency distribution.
− are structured or classified into categories for better presentation and analysis.

In Statistics, it is often necessary to find the sum of a set of numbers.

Greek letter sigma (∑)
− traditional symbol used to indicate a summation.

Population
− the mean of a population is denoted by the Greek letter μ (lowercase mu)

Sample
− any subset of the population
− the mean of a sample is denoted by x̄ (which is read as “x bar”)

MEAN FOR UNGROUPED DATA

(a) Arithmetic Mean or Mean
Definition:
The mean of n numbers is the sum of the numbers divided by n.

Illustration:
Find the mean score of 10 students whose scores in their final examination are 78, 81, 76, 74, 92, 73, 84, 96, 87 and 79.

Solution:
The sum of the 10 scores is 820. Dividing the sum of the scores by 10 gives a mean score of 82.

Aside from the simple mean, there is another type of mean for ungrouped data, called the weighted mean.

(b) Weighted Mean
− a value, called the weighted mean, is often used when some data are more important than others.
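The arithmetic mean and the weighted mean described in this lesson can be sketched in a few lines of Python. The function names are illustrative only. The weighted mean is written in its usual textbook form (sum of weight times value, divided by the sum of the weights), which is assumed here because the lesson does not show its formula. The grades and unit weights are hypothetical; the examination scores are the ones from the illustration.

```python
def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights.

    Assumed textbook form; the lesson itself does not show the formula.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Final examination scores from the illustration.
scores = [78, 81, 76, 74, 92, 73, 84, 96, 87, 79]
print(arithmetic_mean(scores))  # 82.0

# Hypothetical grades and unit weights, not taken from the lesson.
grades = [90, 80, 70]
units = [3, 2, 1]
print(weighted_mean(grades, units))
```

With equal weights the weighted mean reduces to the arithmetic mean, which is one quick way to check an implementation.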
STATISTICAL ANALYSIS WITH SOFTWARE PROGRAMMING
SECOND YEAR – FIRST SEM | A.Y. 2023-2024
where:
fi = frequency in each class
n = total no. of observations

Example:
Compute the mean wage of 20 employees of ABC Company using both the long and short method.

3. The third property is that the algebraic sum of the deviations of the various values from the arithmetic mean equals zero.
Illustration:
The scores 8, 4, 2, 6, 7 have a mean of 5.4. The deviations of the numbers from their arithmetic mean are 2.6, -1.4, -3.4, 0.6, and 1.6, respectively, with algebraic sum equal to zero.
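The zero-sum property of deviations can be checked numerically. The short sketch below uses the five scores from the illustration; the variable names are illustrative. Because floating-point arithmetic is inexact, the total is compared with zero within a small tolerance rather than tested for exact equality.

```python
scores = [8, 4, 2, 6, 7]

mean = sum(scores) / len(scores)
# Deviation of each score from the arithmetic mean.
deviations = [x - mean for x in scores]
total = sum(deviations)

print(mean)                               # 5.4
print([round(d, 1) for d in deviations])  # [2.6, -1.4, -3.4, 0.6, 1.6]
# The algebraic sum is zero up to floating-point rounding error.
print(abs(total) < 1e-9)                  # True
```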
MEDIAN
positional or middle value
Median
− is a point in the distribution of scores at which 50 percent of the scores fall below and 50 percent of the scores fall above. In short, it is a value that divides an array into two equal parts: as many cases lie above it as below it.
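For ungrouped data, the definition above can be sketched as follows: sort the values, then take the middle item, or the average of the two middle items when the count is even. The function name is illustrative. The eight-value set is the one used in this lesson's illustration; the three-value set is hypothetical.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle
    values when the number of items is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Even number of values (set from the lesson's illustration).
print(median([4, 8, 10, 14, 18, 19, 23, 24]))  # 16.0

# Odd number of values (hypothetical set).
print(median([7, 3, 5]))  # 5
```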
(b) 4, 8, 10, 14, 18, 19, 23, 24
(8 values – even) Md = (14+18)/2 = 16

The median is the value of the middle item after arranging all the items in the group in either ascending or descending order. In the illustration, two sets of values are given. The first set has an odd number of values, so it automatically contains a middle item, which is its median. The second set has an even number of values, so it has two middle items, the average of which is the median.

MEDIAN FOR GROUPED DATA
Median Formula
where:
LB md = lower boundary of median class
CF < = cumulative frequency of the preceding class
c = class interval (c = UB - LB)

Problem:
Determine the median wage of 20 employees of ABC Company.

2. The median is an ordinal statistic since its calculation is based on the ordinal properties of the data being analyzed.
3. The median is not amenable to further computations.
4. The median is not affected by extreme values since it is a positional measure. The highest value in a distribution does not enter into the computation of the median.
5. In an open-ended distribution, the median is the most reliable measure of central tendency that can be computed.
6. The medians of different distributions cannot be combined to give the median of the combined distributions.

MODE
Most frequently occurring value
Mode
− is the most frequently appearing score or group of scores in a distribution.

MODE FOR UNGROUPED DATA
The mode for ungrouped data is the observation that occurs most frequently. If two observations are tied for the highest frequency, the set of data is said to be bimodal. If no value occurs more than once, then there is no mode.
Illustration:
Find the modes of the following sets of values.
a. 12, 29, 35, 36, 36, 45, 45, 45, 50, 53
Mo = 45 (unimodal)
b. 8, 7, 6, 5, 6, 9, 2, 3, 11, 14, 11
Mo = 6, 11 (bimodal)
c. 2, 5, 7, 8, 9, 12
Mo = no mode
d. 2, 2, 2, 3, 3, 3, 4, 4, 4
Mo = none
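The four cases above can be reproduced with a small Python sketch. The function name is illustrative. It follows the convention used in sets c and d: when every distinct value occurs equally often, the data set is reported as having no mode. Treating a data set that consists of a single repeated value as having that value as its mode is an assumption made here, not something the lesson states.

```python
from collections import Counter


def modes(values):
    """Return the list of modes, or an empty list when there is no mode."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    # No mode when all distinct values share the same frequency (sets c and d).
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([12, 29, 35, 36, 36, 45, 45, 45, 50, 53]))  # [45]
print(modes([8, 7, 6, 5, 6, 9, 2, 3, 11, 14, 11]))      # [6, 11]
print(modes([2, 5, 7, 8, 9, 12]))                       # []
print(modes([2, 2, 2, 3, 3, 3, 4, 4, 4]))               # []
```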
Looking at the given illustrations, we can see that some sets of numerical data have no mode, while others have more than one mode.

MODE FOR GROUPED DATA
The modal class is the class that contains the highest frequency.

Solution:

Properties of Mode
1. The mode is used for nominal data. Its computation depends on the frequency of occurrence.
2. It indicates roughly the center of concentration of a distribution.
3. The mode is a very unstable value. It can change radically if the method of rounding data is changed.
4. It is appropriate to use the mode as a measure of central tendency if the distribution is bimodal.

PURPOSES IN COMPUTING THE MEASURES OF CENTRAL TENDENCY
1) We compute the MEAN to:
1.1 Describe a set of data whose values are close to each other.
1.2 Compare two or more sets of data where variations of values among sets follow the same pattern or the distributions have the same characteristics.
1.3 Have a stable and reliable measure.
1.4 Be used for further statistical computations, such as the standard deviation and others.
LESSON 2.2: MEASURES OF DISPERSION
OVERVIEW
Variability
− refers to the extent to which the scores on a quantitative variable in a distribution are spread out.
− The most common measure of variability is the standard deviation.
− The calculations of the measures of variability depend on whether the data are taken from the sample or from the population.

Added info:
One of the important characteristics of any set of data is that not all values are alike; the extent to which they differ or vary among themselves is important in statistics.
The measure of dispersion measures the extent to which data are dispersed or spread out. It serves as a supplement to central tendency and, at the same time, gives meaning to the measures of central tendency.
The measures of variation indicate the nature or degree of clustering. The more concentrated the values about the mean or average, the more meaningful the average is as a measure of location.

VARIANCE
− is the mean of the squared deviations of the observations from the mean. It is a measure of variability that considers the position of each observation relative to the mean of the set of scores.
Formula:
where:
s² = variance
x̄ = mean
Second Formula:
1. Compute the standard deviation of scores 7, 10, 14, 19, and 25 using the first and second formula.
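The lesson's two variance formulas are not reproduced in the text, so the sketch below assumes the two forms commonly given in textbooks for a sample: the definitional form, which sums squared deviations from the mean, and the computational form, which uses only the sum of the values and the sum of their squares. Both divide by n - 1, matching the sample notation s² and x̄; for a population the divisor would be n instead. The function names are illustrative, and the scores 7, 10, 14, 19, and 25 are the ones used in this lesson.

```python
import math


def variance_definitional(values):
    """Sample variance: sum of squared deviations from the mean over n - 1
    (assumed form of the 'first formula')."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def variance_computational(values):
    """Sample variance from the sum and the sum of squares
    (assumed form of the 'second formula')."""
    n = len(values)
    total = sum(values)
    total_sq = sum(x * x for x in values)
    return (n * total_sq - total ** 2) / (n * (n - 1))


scores = [7, 10, 14, 19, 25]
s2_first = variance_definitional(scores)
s2_second = variance_computational(scores)

print(s2_first, s2_second)  # 51.5 51.5
# The standard deviation is the square root of the variance (about 7.18 here).
print(math.sqrt(s2_first))
```

Both forms give the same variance, which is a useful check when working the exercise by hand.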