Lesson 2 StatAna

The document discusses measures of central tendency including the mean, median, and mode. It defines these terms and provides examples of calculating each measure using both ungrouped and grouped data. Properties of the mean and median are also covered.

Uploaded by

Iya Garcia
Copyright
© All Rights Reserved

STATISTICAL ANALYSIS WITH SOFTWARE PROGRAMMING

SECOND YEAR – FIRST SEM | A.Y. 2023-2024

LESSON 2.1: MEASURES OF CENTRAL TENDENCY

OVERVIEW
Anyone who works with statistics and wants a quick picture of the form of a distribution, and of the characteristics of the population from which the data were collected, needs measures that summarize the data. In particular, one calculates a single number that is typical of the general level of magnitude of the measurements in a set. This number is known as the measure of central tendency.

Indeed, one of the most basic statistical concepts involves finding measures of central tendency of a set of numerical data. It is often helpful to find numerical values that locate, in some sense, the center of a set of data.

Measure of Central Tendency
− is the point around which the scores tend to cluster. It is the center of concentration of scores in any set of data.
− It is also a value which gives a summary of the characteristics of a given set of data.

THREE MEASURES OF CENTRAL TENDENCY
1. Mean
2. Median
3. Mode

THE MEAN
representative or typical value in a set of numerical data

Arithmetic Mean/Mean
− is the sum of all the given values or items in a distribution divided by the number of values or items summed.

TWO TYPES OF DATA
In its calculation, two types of data are involved.

Ungrouped Data
− refers to data not yet organized into a frequency distribution.

Grouped Data
− refers to a set of data presented in the form of a frequency distribution.
− are structured or classified into categories for better presentation and analysis.

In Statistics, it is often necessary to find the sum of a set of numbers.

Greek letter sigma (∑)
− traditional symbol used to indicate a summation.

Summation Notation (∑x)
− notation used to denote the sum of all the numbers in a set. We often define the mean using the summation notation.

Statisticians often collect data from small portions of a large group to determine information about the group.

Population
− the entire group under consideration.
− the mean of a population is denoted by the Greek letter μ (lowercase mu).

Sample
− any subset of the population.
− the mean of a sample is denoted by x̄ (which is read as "x bar").

MEAN FOR UNGROUPED DATA

(a) Arithmetic Mean or Mean
− the mean is the sum of all the given values or items in a distribution divided by the number of values or items summed.
− In the illustration that follows, we are given the examination scores of 10 students, whose sum is equal to 820. Dividing the sum of the scores by 10 gives a mean score of 82.

Definition:
The mean of n numbers is the sum of the numbers divided by n.
Illustration:
Find the mean of 10 students whose scores in their final examination are 78, 81, 76, 74, 92, 73, 84, 96, 87 and 79.
Solution:
x̄ = ∑x / n = (78 + 81 + 76 + 74 + 92 + 73 + 84 + 96 + 87 + 79) / 10 = 820 / 10 = 82

Aside from the simple mean, we have another type of mean for ungrouped data which is called the weighted mean.

(b) Weighted Mean
− a value, called the weighted mean, is often used when some data are more important than others.

Definition:
The weighted mean of the n numbers x1, x2, x3, ..., xn with respective assigned weights of w1, w2, w3, ..., wn is:

Weighted mean = ∑(w · x) / ∑w

where ∑(w · x) is the sum of the products formed by multiplying each number by its assigned weight, and ∑w is the sum of all weights.

Example:
There are 1,000 notebooks sold at Php10 each; 500 notebooks at Php20 each; 500 notebooks at Php25 each; and 100 notebooks at Php30 each. Compute the weighted mean.

Solution:
Weighted mean = (1,000 × 10 + 500 × 20 + 500 × 25 + 100 × 30) / (1,000 + 500 + 500 + 100) = 35,500 / 2,100 ≈ Php16.90

MEAN FOR GROUPED DATA

Midpoint or Long Method

x̄ = ∑(fi · xi) / n

where:
fi = frequency in each class
xi = midpoint of each class
n = total no. of observations

Example:
Compute the mean wage of 20 employees of ABC Company using both the long and short method.

Long Method:

Properties of the Mean

1. The presence of an extreme value that is very large or very small in a series of data will have a pronounced effect upon the arithmetic mean.
Illustration:
The scores of 7 students in a quiz are 84, 80, 78, 85, 82, 80 and 24, and the mean is about 73. The single low mark of 24 has pulled the mean down to 73, but this value does not describe the distribution of scores, for we cannot say that the scores are scattered around 73. It is better to say, "Except for 24, which is an extremely low score, the mean score of the group is 81.5."

2. Adding a constant to each observation increases the mean by the amount of the constant added. If a constant is subtracted from each score, the mean is decreased by the amount of the constant. When each score is multiplied (or divided) by a constant, the mean is multiplied (or divided) by that constant.
Illustration:

3. The third property is that the algebraic sum of the deviations of the various values from the arithmetic mean equals zero.
Illustration:
The scores 8, 4, 2, 6, 7 have a mean of 5.4. The deviations of the numbers from their arithmetic mean are 2.6, -1.4, -3.4, 0.6, and 1.6, respectively, with algebraic sum equal to zero.

4. Also, the sum of the squares of the deviations of values from the arithmetic mean is a minimum, i.e., it is less than the sum of the squares of the deviations around any other value. For the scores above, the sum of the squares of the deviations taken from the mean is (2.6)² + (-1.4)² + (-3.4)² + (0.6)² + (1.6)² = 23.2.
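
The mean computations and properties in this part of the lesson can be checked with a short Python sketch. It reuses the examination scores, the notebook example, and the scores 8, 4, 2, 6, 7 from the text; the function names `weighted_mean` and `sum_sq_dev` are illustrative choices, not part of the lesson.

```python
# Arithmetic mean of the 10 examination scores
scores = [78, 81, 76, 74, 92, 73, 84, 96, 87, 79]
x_bar = sum(scores) / len(scores)          # 820 / 10

def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Notebook example: the price is the value, the number sold is the weight
prices = [10, 20, 25, 30]
sold = [1000, 500, 500, 100]
wm = weighted_mean(prices, sold)           # 35500 / 2100

# Properties 2 to 4, using the scores 8, 4, 2, 6, 7
data = [8, 4, 2, 6, 7]
m = sum(data) / len(data)                  # 5.4

# Property 2: adding a constant to every score shifts the mean by that constant
shifted_mean = sum(x + 3 for x in data) / len(data)

# Property 3: the deviations from the mean sum to zero (up to float rounding)
sum_dev = sum(x - m for x in data)

def sum_sq_dev(center):
    """Sum of squared deviations of `data` around `center`."""
    return sum((x - center) ** 2 for x in data)

# Property 4: the sum of squares is smallest when taken around the mean
print(x_bar, round(wm, 2), round(sum_sq_dev(m), 1))
print(sum_sq_dev(m) < sum_sq_dev(5), sum_sq_dev(m) < sum_sq_dev(6))
```

Comparisons involving `m` use rounding or a tolerance because 5.4 has no exact binary floating-point representation.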

MEDIAN
positional or middle value

Median
− is a point in the distribution of scores at which
50 percent of the scores fall below and 50
percent of the scores fall above. In short, it is a
value that divides an array into two equal
parts, that is, a point in a set of variates above
which are an equal number of cases as there
are below it.

MEDIAN FOR UNGROUPED DATA
Definition:
The median is the value of the middle item after arranging all the items in the group in either ascending or descending order.
Illustration:
Find the median of the two sets of values.
(a) 4, 6, 9, 10, 12, 15, 17, 20, 28
(b) 4, 8, 10, 14, 18, 19, 23, 24

Solution:
(a) 4, 6, 9, 10, 12, 15, 17, 20, 28
(9 values – odd) Md = 12

(b) 4, 8, 10, 14, 18, 19, 23, 24
(8 values – even) Md = (14+18)/2 = 16

In the illustration, two sets of values are given. The first set has an odd number of values, so it automatically contains a middle item, which is its median. The second set has an even number of values, so it has two middle items, and the average of these is the median.

MEDIAN FOR GROUPED DATA

Median Formula

Md = LBmd + [(n/2 − CF<) / fmd] × c

where:
LBmd = lower boundary of the median class
n = total no. of observations
CF< = cumulative frequency of the class preceding the median class
fmd = frequency of the median class
c = class interval
c = UB − LB

Problem:
Determine the median wage of 20 employees of ABC Company.

Solution:

Properties of the Median
1. The median is that point in an arranged set of variates above which are an equal number of cases as there are below it. So, it is a point in a distribution such that 50% or 1⁄2 of the cases are below it and 50% or 1⁄2 are above it.
2. The median is an ordinal statistic since its calculation is based on the ordinal properties of the data being analyzed.
3. The median is not amenable to further computations.
4. The median is not affected by extreme values since it is a positional measure. The highest value in a distribution does not enter into the computation of the median.
5. In an open-ended distribution, the median is the most reliable measure of central tendency that can be computed.
6. The medians of different distributions cannot be combined to give the median of the combined distributions.

MODE
Most frequently occurring value

Mode
− is the most frequently appearing score or group of scores in a distribution.

MODE FOR UNGROUPED DATA
The mode for ungrouped data is the observation that occurs most frequently. If two observations are tied for the highest frequency, the set of data is said to be bimodal. If no value occurs more than once, then there is no mode.

Illustration:
Find the modes of the following sets of values.
a. 12, 29, 35, 36, 36, 45, 45, 45, 50, 53
Mo = 45 (unimodal)
b. 8, 7, 6, 5, 6, 9, 2, 3, 11, 14, 11
Mo = 6, 11 (bimodal)
c. 2, 5, 7, 8, 9, 12
Mo = no mode
d. 2, 2, 2, 3, 3, 3, 4, 4, 4
Mo = none
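
The median and mode illustrations for ungrouped data can be reproduced with a minimal Python sketch. The helper names `median` and `modes` are illustrative. The rule that a set has no mode when every distinct value is tied follows the convention used in illustration (d), which is an assumption about how ties are treated, not a universal definition.

```python
from collections import Counter

def median(values):
    """Middle item of the sorted values; average of the two middle items if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def modes(values):
    """Return the most frequent value(s) as a sorted list, or [] if there is no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []                      # no value occurs more than once
    winners = sorted(v for v, c in counts.items() if c == highest)
    if len(counts) > 1 and len(winners) == len(counts):
        return []                      # every distinct value is tied, as in set (d)
    return winners

print(median([4, 6, 9, 10, 12, 15, 17, 20, 28]))            # 12
print(median([4, 8, 10, 14, 18, 19, 23, 24]))               # 16.0
print(modes([12, 29, 35, 36, 36, 45, 45, 45, 50, 53]))      # [45]
print(modes([8, 7, 6, 5, 6, 9, 2, 3, 11, 14, 11]))          # [6, 11]
print(modes([2, 5, 7, 8, 9, 12]))                           # []
print(modes([2, 2, 2, 3, 3, 3, 4, 4, 4]))                   # []
```

Both helpers assume a non-empty list of numbers.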

Looking at the given illustrations, we can see that a set of numerical data may have no mode at all, and that a set of data may have more than one mode.

MODE FOR GROUPED DATA
The modal class is the class that contains the highest frequency.

Mo = LBmo + [(fmo − f1) / (2fmo − f1 − f2)] × c

where:
LBmo = lower boundary of the modal class
fmo = frequency of the modal class
f1 = frequency of the class preceding the modal class
f2 = frequency of the class after the modal class
c = class interval

Problem:
Find the modal wage of 20 employees of ABC Company.

Solution:

Properties of Mode
1. The mode is used for nominal data. Its computation depends on the frequency of occurrence.
2. It indicates roughly the center of concentration of a distribution.
3. The mode is a very unstable value. It can change radically if the method of rounding data is changed.
4. It is appropriate to use the mode as a measure of central tendency if the distribution is bimodal.

PURPOSES IN COMPUTING THE MEASURES OF CENTRAL TENDENCY

1) We compute the MEAN to:
1.1 Describe a set of data whose values are close to each other.
1.2 Compare two or more sets of data where variations of values among sets follow the same pattern or the distributions have the same characteristics.
1.3 Have a stable and reliable measure.
1.4 Be used in further statistical computations, such as the standard deviation and other measures.

2) We compute the MEDIAN to:
2.1 Determine the point of central value in a set of data containing extremely high or low scores in a skewed distribution.
2.2 Know the middle value such that cases falling within the upper half and those within the lower half of the distribution are determined.
2.3 Find the central value when data are ordinal or ranked, or when data are in an open-ended distribution.

3) We compute the MODE to:
3.1 Estimate the central value quickly.
3.2 Know the most typical case.
3.3 Have a rough estimate of the central value, when a rough estimate is already sufficient.
3.4 Deal with nominal data.
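
The grouped-data computations for the mean, median, and mode in this lesson can be sketched in Python as follows. Two assumptions are made here. First, the frequency table is hypothetical and is not the ABC Company wage data, which is not reproduced in this text. Second, the functions use the standard textbook forms of the grouped-data formulas, matching the symbols defined in the lesson (lower boundary, CF<, fmo, f1, f2, c = UB − LB).

```python
# Hypothetical frequency table (NOT the ABC Company data):
# each row is (lower boundary, upper boundary, frequency), n = 20
table = [
    (9.5, 19.5, 2),
    (19.5, 29.5, 5),
    (29.5, 39.5, 8),
    (39.5, 49.5, 4),
    (49.5, 59.5, 1),
]

def grouped_mean(rows):
    """Midpoint (long) method: sum(f * midpoint) / n."""
    n = sum(f for _, _, f in rows)
    return sum(f * (lb + ub) / 2 for lb, ub, f in rows) / n

def grouped_median(rows):
    """LB + ((n/2 - CF<) / f) * c, using the first class whose cumulative frequency reaches n/2."""
    n = sum(f for _, _, f in rows)
    cf_before = 0                      # cumulative frequency of preceding classes
    for lb, ub, f in rows:
        if cf_before + f >= n / 2:
            return lb + (n / 2 - cf_before) / f * (ub - lb)
        cf_before += f

def grouped_mode(rows):
    """LB + ((fmo - f1) / (2*fmo - f1 - f2)) * c, using the first class with the highest frequency."""
    freqs = [f for _, _, f in rows]
    i = freqs.index(max(freqs))
    lb, ub, f_mo = rows[i]
    f1 = freqs[i - 1] if i > 0 else 0                 # class before the modal class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0    # class after the modal class
    return lb + (f_mo - f1) / (2 * f_mo - f1 - f2) * (ub - lb)

print(grouped_mean(table))              # 33.0
print(grouped_median(table))            # 33.25
print(round(grouped_mode(table), 2))    # 33.79
```

For this table the median class and the modal class are both 29.5 to 39.5, so the three measures land close together. With a skewed table they would separate.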

LESSON 2.2: MEASURES OF DISPERSION

OVERVIEW
Variability
− refers to the extent to which the scores on a quantitative variable in a distribution are spread out.
− The most common measure of variability is the standard deviation.
− The calculations of the measures of variability depend on whether the data are taken from the sample or from the population.

Added info:
An important characteristic of any set of data is that its values are not all alike. The extent to which they differ, or vary among themselves, is important in statistics.
The measure of dispersion measures the extent to which data are dispersed or spread out. It serves as a supplement to central tendency, and at the same time, gives meaning to the measures of central tendency.
The measures of variation indicate the nature or degree of clustering. The more concentrated the values are about the mean or average, the more meaningful the average is as a measure of location.

VARIANCE
− is the mean of the squared deviations of the observations from the mean. It is a measure of variability that considers the position of each observation relative to the mean of the set of scores.
Formula:

where:
s² = variance
x̄ = mean

STANDARD DEVIATION (SD)
− the positive square root of the variance

SD FOR UNGROUPED DATA

First Formula:

Second Formula:

1. Compute the standard deviation of scores 7, 10, 14, 19, and 25 using the first and second formulas.

SD FOR UNGROUPED DATA


First Formula:

Second Formula:

2. Compute the standard deviation using the first and second formulas.
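
As a check on the problem with scores 7, 10, 14, 19, and 25, here is a minimal Python sketch. The two formulas themselves are not reproduced in this text, so the sketch assumes that the "first formula" is the deviation form and the "second formula" is the raw-score (computational) form, both with the sample denominator n − 1. The population version, which divides by n, is shown for comparison.

```python
import math
from statistics import pstdev, stdev

scores = [7, 10, 14, 19, 25]
n = len(scores)
x_bar = sum(scores) / n                          # 75 / 5 = 15.0

# Assumed "first formula" (deviation form): sqrt( sum((x - x_bar)^2) / (n - 1) )
ss = sum((x - x_bar) ** 2 for x in scores)       # 64 + 25 + 1 + 16 + 100 = 206.0
sd_deviation = math.sqrt(ss / (n - 1))

# Assumed "second formula" (raw-score form):
# sqrt( (n * sum(x^2) - (sum(x))^2) / (n * (n - 1)) )
sum_x = sum(scores)                              # 75
sum_x2 = sum(x * x for x in scores)              # 1331
sd_raw = math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

# Population standard deviation divides by n instead of n - 1
sd_population = math.sqrt(ss / n)

print(round(sd_deviation, 4), round(sd_raw, 4))  # 7.1764 7.1764
print(round(sd_population, 4))                   # 6.4187

# Cross-check against the standard library
print(math.isclose(sd_deviation, stdev(scores)), math.isclose(sd_population, pstdev(scores)))
```

Both forms give the same sample variance, 51.5, because the raw-score form is an algebraic rearrangement of the deviation form.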
