Module 3 Descriptive Statistics Final

This module discusses descriptive statistics and covers measures of central tendency (mean, median, mode), measures of variability (range, standard deviation, variance), and frequency distributions. The key learning objectives are to differentiate the measures of central tendency, identify measures of variability, and organize and display data using tables and graphs. Descriptive statistics are used to summarize collected data by aggregating individual scores in a way that provides an overview of how the data is distributed.

Uploaded by

Jordine Umayam
Copyright
© All Rights Reserved

Course Code:

Course Title

MODULE NO. 03

TITLE DESCRIPTIVE STATISTICS

OVERVIEW This module deals with the Measures of Central
Tendency, Measures of Variability, Frequency
Distribution, and Running Descriptive Statistics
through SPSS.

INTRODUCTION This module will show the statistical methods that
can be used to summarize data. After collecting
data, researchers are faced with pages of
unorganized numbers, stacks of survey responses,
etc. The goal of descriptive statistics is to
aggregate the individual scores (data) in a way
that can be readily summarized. A frequency
distribution table can be used to get a “picture” of
how scores were distributed. It organizes and
presents large data sets using tables and graphs.

LEARNING OUTCOMES The students will learn the Measures of Central
Tendency, Measures of Variability and Frequency
Distribution.

LEARNING OBJECTIVES At the end of the discussion, the learners will be
able to:

1. Differentiate the Measures of Central
Tendency, their uses and limitations.
2. Identify the Measures of Variability.
3. Organize and display data using tables and
graphs.

Let’s read
Measures of Central Tendency

A measure of central tendency is a summary statistic that represents the center
point or typical value of a data set. These measures indicate where most values in a
distribution fall and are also referred to as the central location of a distribution. You can
think of it as the tendency of data to cluster around a middle value. In statistics, the
three most common measures of central tendency are the mean, median, and mode.
Each of these measures calculates the location of the central point using a different
method. Choosing the best measure of central tendency depends on the type of data
you have.

 Three measures of central tendency


 The Mean
 The Median
 Mode

Unfortunately, no single measure of central tendency works best in all circumstances.
Nor will they necessarily give you the same answer.

Example
SAT scores from a sample of 10 college applicants yielded the following:
 Mode: 480
 Median: 505
 Mean: 526

Which measure of central tendency is most appropriate?

 The Mean

 The mean is simply the arithmetic average.
 The mean would be the amount that each individual would get if we took the total
and divided it up equally among everyone in the sample.
 Alternatively, the mean can be viewed as the balancing point in the distribution
of scores (i.e., the distances for the scores above and below the mean cancel out).

 The Median

 The median is the score that splits the distribution exactly in half.
 50% of the scores fall above the median and 50% fall below.
 The median is also known as the 50th percentile, because it is the score at
which 50% of the people fall below.
Special Notes
 A desirable characteristic of the median is that it is not affected by extreme
scores.

Example:
 Sample 1: 18, 19, 20, 22, 24
 Sample 2: 18, 19, 20, 22, 47

 Thus, the median is not distorted by skewed distributions.
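
As a quick check of the example above, the following Python sketch computes the mean and the median of both samples with the standard-library statistics module. The variable names are illustrative and are not part of the module.

```python
from statistics import mean, median

# The two samples from the example; only the last score differs.
sample_1 = [18, 19, 20, 22, 24]
sample_2 = [18, 19, 20, 22, 47]

for name, scores in (("Sample 1", sample_1), ("Sample 2", sample_2)):
    # The mean uses every score; the median only uses the middle position.
    print(f"{name}: mean = {mean(scores):.1f}, median = {median(scores)}")
```

The extreme score of 47 raises the mean from 20.6 to 25.2, while the median stays at 20 in both samples.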

 The Mode

 The mode is simply the most common score.
 There is no formula for the mode.
 When using a frequency distribution, the mode is simply the score (or
interval) that has the highest frequency value.
 When using a histogram, the mode is the score (or interval) that corresponds
to the tallest bar.
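
The idea of reading the mode off a frequency distribution can be sketched in Python as follows. The scores are hypothetical values made up for illustration; they do not come from this module.

```python
from collections import Counter

# Hypothetical scores, made up for illustration only.
scores = [3, 4, 4, 5, 4, 2, 5, 4, 3]

frequency = Counter(scores)          # score -> number of times it occurs
highest = max(frequency.values())    # the highest frequency value

# Collect every score that reaches the highest frequency.
modes = sorted(s for s, f in frequency.items() if f == highest)

print(dict(sorted(frequency.items())))
print("mode(s):", modes)
```

The result is kept as a list because a distribution can have more than one score with the highest frequency.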

Uses of the Measures of Central Tendency

The Mean is used . . .


 For interval and ratio measurements
 When there are no extreme values in a distribution, since it is easily affected by
extremely high or extremely low scores.
 When higher statistical computations are wanted.
 When the greatest reliability of the measure of central tendency is wanted, since its
computation includes all the given values.

The median is used . . .


 For ordinal and ranked measurements
 When there are extreme values and the distribution is therefore markedly skewed.
 For an open-end distribution, that is, when the lowest or the highest class interval or both
are not bounded (e.g., 50 and below or 100 and above)
 When one desires to know whether the cases fall within the upper half or the
lower half of a distribution.

The Mode is used . . .


 For nominal and categorical data.
 When a rough or quick estimate of a central value is wanted.
 When the most popular or the most typical case or value in a distribution is wanted.
Limitations of the Measures of Central Tendency

Limitations of the Mean . . .


 It is the most widely used average, since it is the most familiar. However, it is often
misused. It cannot be used if the clustering of values or items is not substantial.
 If the given values do not tend to cluster around a central value, the mean is a poor
measure of central location.
 It is easily affected by extremely large or small values. One small value can easily
pull down the mean.
 The mean cannot be used alone to compare distributions, since the means of 2 or more
distributions may be the same but their other characteristics may be entirely
different. The means of distribution A, whose values are 80, 85, and 90, and
distribution B, whose values are 86, 85, and 84, are both 85. We cannot conclude, however,
that both distributions possess the same characteristics, since their patterns of
dispersion or variation are markedly different despite having the same mean.
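
The comparison of distributions A and B can be checked with a short Python sketch. It uses the standard-library statistics module; the range and the sample standard deviation are added here only to show how the spread differs, and the variable names are illustrative.

```python
from statistics import mean, stdev

# Distributions A and B from the text above.
distribution_a = [80, 85, 90]
distribution_b = [86, 85, 84]

for name, values in (("A", distribution_a), ("B", distribution_b)):
    spread = max(values) - min(values)   # range: highest minus lowest value
    print(f"Distribution {name}: mean = {mean(values)}, "
          f"range = {spread}, SD = {stdev(values):.2f}")
```

Both means are 85, but the range is 10 for A and 2 for B, and the sample standard deviation is 5.00 for A and 1.00 for B.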

Limitations of the Median . . .


 It is easily affected by the number of items in a distribution.
 It cannot be determined if the given values are not arranged according to
magnitude.
 If a distribution contains many values, arranging them according to magnitude
becomes a laborious task.
 Its value is not as accurate as the mean since it is just an ordinal statistic.

Limitations of the Mode . . .


 It is seldom used since it does not always exist.
 Its value is just a rough estimate of the center of concentration of a distribution.
 It is very unstable since its value easily changes depending on the approach used
in finding it.
Distribution Shape and Central Tendency

 In a normal distribution, the mean, median, and mode will be
approximately equal.

[Figure: normal distribution with the mean, median (Med), and mode (Mo) at the center]

Skewed Distribution

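 In a skewed distribution, the mode will be the peak, the mean will be pulled toward
the tail, and the median will fall in the middle.

[Figure: skewed distribution showing the mode (Mo) at the peak, the median (Med) in the middle, and the mean toward the tail]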

Choosing the Proper Statistic

 Continuous Data
 Always report the mean
 If data are substantially skewed, it is appropriate to use the median as well

 Categorical Data
 For nominal data you can only use the mode
 For ordinal data the median is appropriate (although
people often use the mean)

Measures of Variability

A measure of variability is a summary statistic that represents the amount of dispersion
in a dataset, that is, how spread out the values are. Measures of variability define how far
away the data points tend to fall from the center. We talk about variability in the context
of a distribution of values. A low dispersion indicates that the data points tend to be
clustered tightly around the center. High dispersion signifies that they tend to fall further
away.

In statistics, variability, dispersion, and spread are synonyms that denote the width of
the distribution. Just as there are multiple measures of central tendency, there are
several measures of variability. The most common measures of variability are the range,
variance, and standard deviation.

 Measure of Variability
 Range
 Standard Deviation
 Variance

 Range
 Range is the distance between two extreme scores.
 It informs us about the dispersion of our distribution.
 The larger the range the larger the dispersion from the mean value.
 Although the mean of the scores of two distributions can be identical their ranges
may be different.

Drawbacks to the Range

 Good preliminary measure, but one single extreme value can influence the range
significantly.

 The calculation of the range is derived from the highest and lowest values and
doesn’t tell us anything about the variability of the different values.
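
The sensitivity of the range to a single extreme value can be sketched in Python with the two samples used in the median example earlier in this module. The helper name value_range is hypothetical and chosen only for this illustration.

```python
def value_range(scores):
    """Return the distance between the highest and the lowest score."""
    return max(scores) - min(scores)

sample_1 = [18, 19, 20, 22, 24]
sample_2 = [18, 19, 20, 22, 47]   # same scores except for one extreme value

print("Range of Sample 1:", value_range(sample_1))
print("Range of Sample 2:", value_range(sample_2))
```

Changing one score from 24 to 47 moves the range from 6 to 29, even though the other four scores are identical.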

 Standard Deviation

 Defined as the variability of the scores around the mean

 Each score in a distribution varies from the mean by a greater or lesser amount,
except when the score is the same as the mean.

 Deviations from the mean can be noted as either positive or negative deviations from
the mean.

 The average of these deviations would equal “zero.”

 Large SD
 Small SD
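
Because the positive and negative deviations cancel out, the standard deviation is built on squared deviations instead. A minimal Python sketch of this point, using distribution A (80, 85, 90) from the discussion of the mean, with illustrative variable names:

```python
from statistics import mean

scores = [80, 85, 90]                    # distribution A from the mean example
m = mean(scores)                         # the mean is 85

deviations = [x - m for x in scores]     # signed distance of each score from the mean
squared = [d ** 2 for d in deviations]   # squaring removes the sign

print("deviations:", deviations, "sum:", sum(deviations))
print("squared deviations:", squared, "sum:", sum(squared))
```

The deviations are -5, 0, and 5 and sum to zero, while the squared deviations sum to 50.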

 Variance

 The variance and the closely-related standard deviation are measures of how spread
out a distribution is.
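
The relationship between the two measures can be sketched in Python: the variance is the average squared deviation from the mean, and the standard deviation is its square root. The sketch assumes the sample form, which divides by n - 1, and uses Sample 1 from the median example; the variable names are illustrative.

```python
import math
from statistics import mean, stdev, variance

scores = [18, 19, 20, 22, 24]   # Sample 1 from the median example
n = len(scores)
m = mean(scores)

# Sample variance: sum of squared deviations divided by n - 1 (an assumption;
# dividing by n instead would give the population variance).
sample_variance = sum((x - m) ** 2 for x in scores) / (n - 1)

# The standard deviation is the square root of the variance.
sample_sd = math.sqrt(sample_variance)

print(f"variance = {sample_variance:.2f}, SD = {sample_sd:.2f}")

# The standard-library functions use the same n - 1 definition.
assert math.isclose(sample_variance, variance(scores))
assert math.isclose(sample_sd, stdev(scores))
```

Because the standard deviation is expressed in the same units as the original scores, it is usually the easier of the two to interpret.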

Frequency Distribution Tables

Overview

 After collecting data, researchers are faced with pages of unorganized numbers,
stacks of survey responses, etc.

 The goal of descriptive statistics is to aggregate the individual scores (data) in a
way that can be readily summarized.

 A frequency distribution table can be used to get a “picture” of how scores
were distributed.

Frequency Distributions

 A frequency distribution displays the number (or
percent) of individuals that obtained a particular score or fell in a particular category.
 As such, these tables provide a picture of where people respond across the range of
the measurement scale.

 One goal is to determine where the majority of respondents were located.

When to Use Frequency Tables

 Frequency distributions and tables can be used to answer all descriptive research
questions.

 It is important to always examine frequency distributions on the independent variable (IV)
and the dependent variable (DV) when answering comparative and relationship questions.

Three Components of a Frequency Distribution Table

 Frequency - the number of individuals that obtained a particular score (or
response).

 Percent - the corresponding percentage of individuals that obtained a particular
score.

 Cumulative Percent - the percentage of individuals that fell at or below a particular
score (not relevant for nominal variables).

Example

 What are the ages of students in an online course?
 Are students likely to recommend the course to others?

AGE   RECOMMENDED
31    2
26    3
32    4
37    5
18    4
31    5
38    4
49    2
35    4
37    3
43    4
41    5
49    4
40    2

 Step 1: Input the Data into SPSS

Step 2: Run the Frequencies

 Analyze Descriptive Statistics Frequencies


 Move variables to the Variables box (select the variables and click on the arrow).
 Click OK.

Example

 Frequency distribution showing the ages of students who took the online
course.

 Student responses when asked whether or not they would recommend the online
course to others.
 Most would recommend the course.
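
The three components of a frequency distribution table can also be computed outside SPSS. The Python sketch below builds the table for the RECOMMENDED variable, with the ratings transcribed from the example data; it is only an illustration of the calculation and is not the SPSS output itself.

```python
from collections import Counter

# RECOMMENDED ratings transcribed from the example data.
recommended = [2, 3, 4, 5, 4, 5, 4, 2, 4, 3, 4, 5, 4, 2]

counts = Counter(recommended)   # frequency of each rating
total = len(recommended)

rows = []
cumulative = 0.0
for rating in sorted(counts):
    percent = 100 * counts[rating] / total   # share of all responses
    cumulative += percent                    # share at or below this rating
    rows.append((rating, counts[rating], percent, cumulative))

print(f"{'Rating':>6} {'Frequency':>9} {'Percent':>8} {'Cum. %':>8}")
for rating, freq, percent, cum in rows:
    print(f"{rating:>6} {freq:>9} {percent:>8.1f} {cum:>8.1f}")
```

Nine of the 14 responses are a 4 or a 5. The text does not give the labels of the rating scale, so reading this as "most would recommend the course" assumes that higher ratings mean a stronger recommendation.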

Running Descriptive Statistics

Example
 Are there differences in the anxiety levels of students who have had statistics before
versus students who have never had statistics?

 Step 1: Input the data into SPSS

STATS HISTORY   ANXIETY SCORE
1               95
1               85
1               65
1               90
1               85
2               65
2               45
2               35
2               75
2               65

 Step 2: Run the descriptive statistics

 Analyze Compare Means Means

 Anxiety = Dependent List Stats History = Independent List

 Click Options
 Move Median over
 Move Minimum over
 Move Maximum over

 Click Continue
 Click OK
 Step 3: Create a Histogram for Anxiety with a normal curve option

 Graphs Legacy Dialogues Histogram


 Variable = anxiety
 Check the “Display normal curve” check box
 Click Ok

Histogram for Anxiety


 Step 4: Write up the results
 Descriptive statistics revealed that students who had previous experience with
statistics (M = 57.00, SD = 16.43) had lower anxiety at the beginning of the semester
than students who did not have any previous experience with statistics (M = 84.00, SD
= 11.40).
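
The reported values can be reproduced from the example data with a short Python sketch that groups the anxiety scores by stats history and computes the same statistics requested in Step 2. This is an illustration only, not SPSS output, and the variable names are assumptions made for the sketch.

```python
from statistics import mean, median, stdev

# (stats history code, anxiety score) pairs from the example data.
data = [(1, 95), (1, 85), (1, 65), (1, 90), (1, 85),
        (2, 65), (2, 45), (2, 35), (2, 75), (2, 65)]

# Collect the anxiety scores separately for each stats history code.
groups = {}
for history, anxiety in data:
    groups.setdefault(history, []).append(anxiety)

for history, scores in sorted(groups.items()):
    print(f"history = {history}: M = {mean(scores):.2f}, SD = {stdev(scores):.2f}, "
          f"median = {median(scores)}, min = {min(scores)}, max = {max(scores)}")
```

Code 1 gives M = 84.00 and SD = 11.40, and code 2 gives M = 57.00 and SD = 16.43. The text does not state how stats history was coded; the write-up is consistent with these numbers if code 2 stands for students with previous statistics experience.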

Summary of when to use the mean, median and mode

Please use the following summary table to know what the best measure of central
tendency is with respect to the different types of variable.

Type of Variable              Best measure of central tendency
Nominal                       Mode
Ordinal                       Median
Interval/Ratio (not skewed)   Mean
Interval/Ratio (skewed)       Median

Activity
Directions: Prepare an SPSS frequency distribution table and histogram for the
following data. Take a screenshot of both the input (both data view and variable
view) and the output, and convert it into PDF format. Submit the PDF file.
Also upload the input and output SPSS files.

1. The following are the heights (in cm) of applicants in the PNP:

177 192 163 198 175


208 192 186 164 169
189 172 165 206 182
184 193 173 164 162
168 165 185 182 173
189 186 169 169 192
201 210 187 201 162
188 202 175 181 205
163 198 196 166 202
168 187 182 192 186
191 177 163 171 185
177 198 178 208 208

a. Find the mean height
b. Find the median height
c. What is the maximum height?
d. What is the minimum height?
e. Find the 1st Quartile, 3rd Quartile and the 5th Decile

2. Bailey has been playing golf on the weekends for the past three years. Recently, she
started keeping track of her recorded scores. Her scores for June and July at her
favorite 9-hole (par 36) golf course are provided below.

45
49
42
56
41
36
34
38
41
40
42
41
39
38
40
39
36
41

 Find the Range, Standard Deviation, and Variance for the above data.
 What does this information tell you about the variability of Bailey's golf game?

Prepared by:

LILIA B. CATULIN, D.P.A.


Professor
