Educ 502 1 1
and Statistics
I. Objectives
1. Demonstrate knowledge of statistical terms.
2. Differentiate between the two branches of statistics.
3. Identify types of data.
4. Identify the measurement level for each variable.
5. Identify the four basic sampling techniques.
6. Explain the difference between an observational and an experimental study.
What is statistics?
Statistics is the science of collecting, organizing, summarizing, and analyzing information to
draw conclusions or answer questions. In addition, statistics is about providing a measure of
confidence in any conclusion.
The science of collecting, describing, and interpreting data (Johnson & Kuby, 2007).
The science of conducting studies to collect, organize, summarize, analyze, and draw conclusions
from data (Bluman, 2008).
The art and science of collecting, analyzing, and interpreting data (Cochran, Anderson, et al., 2013).
Students study statistics for several reasons:
To read and understand the various statistical studies performed in their fields.
To conduct research in their fields, since statistics is basic to research.
To use the knowledge gained from studying statistics to become better consumers and citizens.
Descriptive and Inferential Statistics
Data can be used in different ways. The body of knowledge called statistics is sometimes divided
into two main areas, depending on how data are used. The two areas are
1. Descriptive statistics
2. Inferential statistics
Descriptive statistics consists of the collection, organization, summarization, and presentation of
data.
Inferential statistics consists of generalizing from samples to populations, performing estimations
and hypothesis tests, determining relationships among variables, and making predictions.
Here, the statistician tries to make inferences from samples to populations. Inferential statistics
uses probability, i.e., the chance of an event occurring. You may be familiar with the concepts of
probability through various forms of gambling.
A population consists of all subjects (human or otherwise) that are being studied.
Most of the time, due to the expense, time, size of population, medical concerns, etc., it is not
possible to use the entire population for a statistical study; therefore, researchers use samples
A sample is a group of subjects selected from a population.
An area of inferential statistics called hypothesis testing is a decision-making process for
evaluating claims about a population, based on information obtained from samples.
Qualitative variables are variables that can be placed into distinct categories, according to some
characteristic or attribute. For example, if the subjects are classified according to sex, then the
variable sex is qualitative. Other examples of qualitative variables are political affiliation, religious
preference, and geographic location.
Quantitative variables are numerical and can be ordered or ranked. For example, the variable age is
numerical, and people can be ranked in order according to the value of their ages. Other examples of
quantitative variables are height, weight, and body temperature.
Discrete variables assume values that can be counted. Examples are the number of children in a family
and the number of students in an online statistics class.
Continuous variables can assume an infinite number of values between any two specific values.
They are obtained by measuring. They often include fractions and decimals.
The measurement levels of variables are nominal, ordinal, interval, and ratio. The nominal level of
measurement classifies data into mutually exclusive, exhaustive categories in which no order or
ranking can be imposed on the data.
Ordinal level of measurement classifies data into categories that can be ranked; however, precise
differences between the ranks do not exist.
Interval level of measurement ranks data and precise differences between units of measure do
exist; however, there is no meaningful zero.
Ratio level of measurement possesses all the characteristics of interval measurement, and there
exists a true zero. In addition, true ratios exist when the same variable is measured on two different
members of the population.
Classify the level of measurement of the following:
Zip code
Gender
Grade (A, B, C, D, F)
Eye color
Judging (first, second, third)
Rating scale (poor, good, excellent)
SAT score
Political affiliation
IQ
Major field (Math ... Science)
Temperature
Ranking of tennis players
Height
Nationality
Weight
Age
Time
Religious affiliation
Salary
Score on the test
Exercises:
1. Classify each as nominal-level, ordinal-level, interval-level, or ratio-level measurement.
a. Pages in the 25 best-selling mystery novels.
b. Rankings of golfers in a tournament.
c. Temperatures inside 10 pizza ovens.
d. Weights of selected cell phones.
e. Salaries of the coaches in the NFL.
f. Times required to complete a chess game.
g. Ratings of textbooks (poor, fair, good, excellent).
h. Number of amps delivered by battery chargers.
i. Ages of children in a day care center.
j. Categories of magazines in a physician’s office (sports, women’s, health, men’s,
news).
2. In each of these statements, tell whether descriptive or inferential statistics have been
used.
a. By 2040 at least 3.5 billion people will run short of water (World Future Society).
b. Nine out of ten on-the-job fatalities are men (Source: USA TODAY Weekend).
c. Expenditures for the cable industry were $5.66 billion in 1996 (Source: USA TODAY).
d. The median household income for people aged 25–34 is $35,888 (Source: USA TODAY).
e. Allergy therapy makes bees go away (Source: Prevention).
f. Drinking decaffeinated coffee can raise cholesterol levels by 7% (Source: American
Heart Association).
g. The national average annual medicine expenditure per person is $1052(Source: The
Greensburg Tribune Review).
h. Experts say that mortgage rates may soon hit bottom (Source: USA TODAY ).
3. Classify each variable as qualitative or quantitative.
a. Marital status of nurses in a hospital.
b. Time it takes to run a marathon.
c. Weights of lobsters in a tank in a restaurant.
d. Colors of automobiles in a shopping center parking lot.
e. Ounces of ice cream in a large milkshake.
f. Capacity of the NFL football stadiums.
g. Ages of people living in a personal care home.
4. Classify each variable as discrete or continuous.
a. Number of pizzas sold by Pizza Express each day.
b. Relative humidity levels in operating rooms at local hospitals.
c. Number of bananas in a bunch at several local supermarkets.
d. Lifetimes (in hours) of 15 iPod batteries.
e. Weights of the backpacks of first graders on a school bus.
f. Number of students each day who make appointments with a math tutor at a
local college.
g. Blood pressures of runners in a marathon.
Random Sampling
Simple random sampling
Systematic Sampling - Researchers obtain systematic samples by numbering each subject of
the population and then selecting every kth subject. For example, suppose there are 2000 subjects in
the population and a sample of 50 subjects is needed. Since 2000 divided by 50 is 40, k = 40 and every
40th subject would be selected; however, the first subject (numbered between 1 and 40) would be selected
at random. Suppose subject 12 were the first subject selected; then the sample would consist of the
subjects whose numbers were 12, 52, 92, etc., until 50 subjects were obtained.
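The selection rule above can be sketched in a few lines of Python. This is only an illustration: the function name systematic_sample and its parameters are hypothetical, and the sketch assumes the population size divides evenly by the sample size, as in the 2000 and 50 example.

```python
import random

def systematic_sample(population_size, sample_size, start=None):
    """Return the subject numbers picked by systematic sampling.

    Assumes subjects are numbered 1..population_size and that
    population_size is a multiple of sample_size.
    """
    k = population_size // sample_size      # sampling interval, 2000 // 50 = 40
    if start is None:
        start = random.randint(1, k)        # random first subject between 1 and k
    return [start + i * k for i in range(sample_size)]

# Reproduce the example in the text, where subject 12 is drawn first.
sample = systematic_sample(2000, 50, start=12)
print(sample[:3])   # first three selected subject numbers
```

Passing start=None lets the function draw the first subject at random, which is what the procedure requires in practice.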
Stratified Sampling is done by dividing the population into groups (called strata) according to
some characteristic that is important to the study, then sampling from each group. The sample within
each group should be randomly selected.
Cluster Sampling - Here the population is divided into groups called clusters by some means such
as geographic area or schools in a large school district, etc. Then the researcher randomly selects some of
these clusters and uses all members of the selected clusters as the subjects of the sample.
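The difference between the two techniques is easier to see side by side. The sketch below is a minimal illustration, not a prescribed procedure: the function names, the strata, and the clusters are all hypothetical. Stratified sampling draws a random sample from every group, while cluster sampling draws whole groups at random and keeps every member of the chosen groups.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """strata: dict of stratum name -> list of subjects.
    Draws a simple random sample from EVERY stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """clusters: dict of cluster name -> list of subjects.
    Randomly picks some clusters and keeps ALL of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [subject for name in chosen for subject in clusters[name]]

# Hypothetical data, for illustration only.
strata = {"first_year": list(range(100)), "second_year": list(range(100, 160))}
clusters = {"School A": ["A1", "A2", "A3"],
            "School B": ["B1", "B2"],
            "School C": ["C1", "C2", "C3", "C4"]}

rng = random.Random(1)                      # seeded so the run is repeatable
print(stratified_sample(strata, 5, rng))
print(cluster_sample(clusters, 2, rng))
```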
Checking the comprehension
Read the following information about the transportation industry and answer the questions.
Transportation Safety The chart shows the number of job-related injuries for each of the
transportation industries for 1998.
Industry Number of injuries
Railroad 4520
Intercity bus 5100
Subway 6850
Trucking 7144
Airline 9950
1. What are the variables under study?
2. Categorize each variable as quantitative or qualitative.
3. Categorize each quantitative variable as discrete or continuous.
4. Identify the level of measurement for each variable.
5. The railroad is shown as the safest transportation industry. Does that mean
railroads have fewer accidents than the other industries? Explain.
6. What factors other than safety influence a person’s choice of transportation?
7. From the information given, comment on the relationship between the
variables
Observational and Experimental Studies
There are several different ways to classify statistical studies. This section
explains two types of studies: observational studies and experimental studies.
In an observational study, the researcher merely observes what is happening or what has
happened in the past and tries to draw conclusions based on these observations.
In an experimental study, the researcher manipulates one of the variables and tries to
determine how the manipulation influences other variables.
Statistical studies usually include one or more independent variables and one dependent
variable. The independent variable in an experimental study is the one that is being manipulated by
the researcher. The independent variable is also called the explanatory variable. The resultant
variable is called the dependent variable or the outcome variable.
The group that received the special instruction is called the treatment group while the
other is called the control group. The treatment group receives a specific treatment (in this case,
instructions for improvement) while the control group does not.
A confounding variable is one that influences the dependent or outcome variable but
was not separated from the independent variable.
Example.
As the evidence on the adverse effects of cigarette smoke grew, people tried many different ways
to quit smoking. Some people tried chewing tobacco or, as it was called, smokeless tobacco. A
small amount of tobacco was placed between the cheek and gum. Certain chemicals from the
tobacco were absorbed into the bloodstream and gave the sensation of smoking cigarettes. This
prompted studies on the adverse effects of smokeless tobacco. One study in particular used 40
university students as subjects. Twenty were given smokeless tobacco to chew, and twenty were given a
substance that looked and tasted like smokeless tobacco, but did not contain any of the harmful
substances. The students were randomly assigned to one of the groups. The students’ blood pressure
and heart rate were measured before they started chewing and 20 minutes after they had been
chewing. A significant increase in heart rate occurred in the group that chewed the smokeless
tobacco.
Answer the following questions.
1. What type of study was this (observational, quasi-experimental, or
experimental)?
(This was an experiment, since the researchers imposed a treatment on each of the
two groups involved in the study)
Exercises:
1. Identify each study as being either observational or experimental.
a. Subjects were randomly assigned to two groups, and one group was given an herb and the other group
a placebo. After 6 months, the numbers of respiratory tract infections each group had
were compared.
b. A researcher stood at a busy intersection to see if the color of the automobile that a
person drives is related to running red lights.
c. A researcher finds that people who are more hostile have higher total cholesterol
levels than those who are less hostile.
d. Subjects are randomly assigned to four groups. Each group is placed on one of four
special diets—a low-fat diet, a high-fish diet, a combination of low-fat diet and high-
fish diet, and a regular diet. After 6 months, the blood pressures of the groups are
compared to see if diet has any effect on blood pressure.
2. Classify each sample as random, systematic, stratified, or cluster.
a. In a large school district, all teachers from two buildings are interviewed to
determine whether they believe the students have less homework to do now than
in previous years.
b. Every seventh customer entering a shopping mall is asked to select her or his
favorite store.
e. Mail carriers of a large city are divided into four groups according to gender
(male or female) and according to whether they walk or ride on their routes.
Then 10 are selected from each group and interviewed to determine whether
they have been bitten by a dog in the last year.
Organizing Data
A frequency distribution is the organization of raw data in table form, using classes and
frequencies.
A frequency distribution consists of classes and their corresponding frequencies. Each raw
data value is placed into a quantitative or qualitative category called a class. The frequency of a
class then is the number of data values contained in a specific class. A frequency distribution is
shown for the preceding data set
Two types of frequency distributions that are most often used are the categorical frequency
distribution and the grouped frequency distribution. The categorical frequency distribution is
used for data that can be placed in specific categories, such as nominal- or ordinal-level data. For
example, data such as political affiliation, religious affiliation, or major field of study would use
categorical frequency distributions
Example.
Distribution of Blood Types Twenty-five army inductees were given a blood test
to determine their blood type. The data set is
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A
Construct a frequency distribution for the data.
Solution:
Since the data are categorical, discrete classes can be used. There are four blood types: A,
B, O, and AB. These types will be used as the classes for the distribution. The procedure for
constructing a frequency distribution for categorical data is given next.
Step 1 Make a table as shown.
A        B        C            D
Class    Tally    Frequency    Percent
A
B
O
AB
Step 2 Tally the data and place the results in column B.
Step 3 Count the tallies and place the results in column C.
Step 4 Find the percentage of values in each class by using the formula
$\% = \frac{f}{n} \cdot 100$
where f = frequency of the class and n = total number of values. For example, in the
class of type A blood, the percentage is
$\% = \frac{5}{25} \cdot 100 = 20\%$
Percentages are not normally part of a frequency distribution, but they can be added
since they are used in certain types of graphs such as pie graphs. Also, the decimal
equivalent of a percent is called a relative frequency
Step 5 Find the totals for columns C (frequency) and D (percent). The completed
table is shown.
A        B            C            D
Class    Tally        Frequency    Percent
A        ||||-        5            20
B        ||||-||      7            28
O        ||||-||||    9            36
AB       ||||         4            16
Total                 25           100
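The tally, count, and percentage steps can be checked with a short Python sketch. It assumes the 25 blood types listed in the example, with the third row read as B B O A O; the variable names are illustrative only.

```python
from collections import Counter

# The 25 blood types from the example (third row assumed to be B B O A O).
blood_types = ("A B B AB O  O O B AB B  B B O A O  "
               "A O O O AB  AB A O B A").split()

counts = Counter(blood_types)     # Steps 2 and 3: tally and count each class
n = len(blood_types)              # total number of values

print("Class  Frequency  Percent")
for cls in ["A", "B", "O", "AB"]:
    f = counts[cls]
    percent = f / n * 100         # Step 4: % = f / n * 100
    print(f"{cls:<6} {f:<10} {percent:.0f}")
print(f"Total  {n:<10} 100")
```

The relative frequency mentioned above is simply f / n, the same quantity before multiplying by 100.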
Grouped Frequency Distributions When the range of the data is large, the data must be grouped
into classes that are more than one unit in width, in what is called a grouped frequency
distribution. For example, a class with limits 24–30 contains the seven values 24, 25, 26, 27, 28, 29, and 30.
Class limits    Class boundaries    Tally         Frequency    cf<    cf>    rf
24–30           23.5–30.5           |||
31–37           30.5–37.5           |
38–44           37.5–44.5           ||||-
45–51           44.5–51.5           ||||-||||-    10
52–58           51.5–58.5
59–65           58.5–65.5
TOTAL
In this distribution, the values 24 and 30 of the first class are called class limits. The lower class
limit is 24; it represents the smallest data value that can be included in the class. The upper class
limit is 30; it represents the largest data value that can be included in the class. The numbers in the
second column are called class boundaries. These numbers are used to separate the classes so that
there are no gaps in the frequency distribution. The gaps are due to the limits; for example, there is
a gap between 30 and 31. Students sometimes have difficulty finding class boundaries when given
the class limits. The basic rule of thumb is that the class limits should have the same decimal place
value as the data, but the class boundaries should have one additional place value and end in a 5.
For example, if the values in the data set are whole numbers, such as 24, 32, and 18, the limits for
a class might be 31–37, and the boundaries are 30.5–37.5. Find the boundaries by subtracting 0.5
from 31 (the lower class limit) and adding 0.5 to 37 (the upper class limit).
Lower limit - 0.5 = 31 - 0.5 = 30.5 = lower boundary
Upper limit + 0.5 = 37 + 0.5 = 37.5 = upper boundary
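The rule of thumb can be written as a small helper. This is a sketch under one assumption: unit is the smallest step between recorded data values (1 for whole numbers), so the boundaries sit half a unit outside the limits. The function name is hypothetical.

```python
def class_boundaries(lower_limit, upper_limit, unit=1):
    """Return (lower boundary, upper boundary) for one class.

    unit is the smallest difference between recorded values;
    for whole-number data it is 1, so half a unit is 0.5.
    """
    half = unit / 2
    return lower_limit - half, upper_limit + half

print(class_boundaries(31, 37))   # the class 31-37 from the text
print(class_boundaries(24, 30))   # the first class of the distribution
```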
To construct a frequency distribution, follow these rules:
1. There should be between 5 and 20 classes. Although there is no hard-and-fast rule for
the number of classes contained in a frequency distribution, it is of the utmost importance
to have enough classes to present a clear description of the collected data.
2. It is preferable but not absolutely necessary that the class width be an odd number. This
ensures that the midpoint of each class has the same place value as the data. The class
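midpoint is obtained by adding the lower and upper boundaries and dividing by 2, or
adding the lower and upper limits and dividing by 2:
$X_m = \frac{\text{lower boundary} + \text{upper boundary}}{2}$
or
$X_m = \frac{\text{lower limit} + \text{upper limit}}{2}$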
3. The classes must be mutually exclusive. Mutually exclusive classes have non overlapping class
limits so that data cannot be placed into two classes.
4. The classes must be continuous. Even if there are no values in a class, the class must be
included in the frequency distribution. There should be no gaps in a frequency distribution.
The only exception occurs when the class with a zero frequency is the first or last class. A
class with a zero frequency at either end can be omitted without affecting the distribution.
5. The classes must be exhaustive. There should be enough classes to accommodate all the
data.
6. The classes must be equal in width. This avoids a distorted view of the data.
Find the width by dividing the range by the number of classes and rounding up.
Select a starting point (usually the lowest value or any convenient number less than the
lowest value); add the width to get the lower limits.
Step 3 Find the numerical frequencies from the tallies, and find the cumulative frequencies.
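The procedure (find the width, set the lower limits, then count frequencies and cumulative frequencies) can be sketched as follows. The function name and the 15 scores are hypothetical, chosen only so that the classes come out as 24–30 through 59–65; the sketch assumes whole-number data and starts at the lowest value.

```python
import math

def grouped_frequency(data, num_classes):
    """Build rows of (lower limit, upper limit, lower boundary,
    upper boundary, f, cf<) for whole-number data."""
    low, high = min(data), max(data)
    # Width = range / number of classes, rounded up (at least 1).
    width = max(1, math.ceil((high - low) / num_classes))
    rows, cumulative, lower = [], 0, low
    # Keep adding classes until the highest value is covered. When the
    # range divides evenly this gives one class more than requested.
    while lower <= high:
        upper = lower + width - 1
        f = sum(1 for x in data if lower <= x <= upper)
        cumulative += f
        rows.append((lower, upper, lower - 0.5, upper + 0.5, f, cumulative))
        lower += width
    return rows

# Hypothetical scores, for illustration only.
scores = [24, 31, 45, 47, 38, 52, 60, 29, 41, 49, 55, 44, 46, 65, 50]
for row in grouped_frequency(scores, 6):
    print(row)
```

Because every class has the same width and the limits never overlap, the classes built this way are mutually exclusive, continuous, and exhaustive.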
All the different types of distributions are used in statistics and are helpful when one is
organizing and presenting data. The reasons for constructing a frequency distribution are as
follows:
1. To organize the data in a meaningful, intelligible way.
4. To enable the researcher to draw charts and graphs for the presentation of data.
5. To
Let's try a practice problem. Given these 90 scores, construct a frequency distribution of
grouped scores having approximately
112 68 55 33 72 80 35 55 62
92 44 122 73 65 78 49 61 65
83 76 95 55 50 82 51 138 73
82 72 89 37 63 95 109 93 65
75 24 60 43 130 107 72 86 71
128 90 48 22 67 76 57 86 114
33 54 64 82 47 81 28 79 85
42 62 86 94 52 106 30 117 98
58 32 68 77 28 69 46 53 38
After you have organized the data into a frequency distribution, you can present them in
graphical form. The purpose of graphs in statistics is to convey the data to the viewers in pictorial
form. It is easier for most people to comprehend the meaning of data presented graphically than
data presented numerically in tables or frequency distributions. This is especially true if the users
have little or no statistical knowledge. Statistical graphs can be used to describe the data set or to
analyze it. Graphs are also useful in getting the audience’s attention in a publication or a speaking
presentation. They can be used to discuss an issue, reinforce a critical point, or summarize a data
set. They can also be used to discover a trend or pattern in a situation over a period of time.
1. The histogram.
The histogram is a graph that displays the data by using contiguous vertical bars (unless the
frequency of a class is 0) of various heights to represent the frequencies of the classes.
The frequency polygon is a graph that displays the data by using lines that connect points plotted
for the frequencies at the midpoints of the classes. The frequencies are represented by the heights
of the points.
The Ogive The third type of graph that can be used represents the cumulative frequencies
for the classes. This type of graph is called the cumulative frequency graph, or ogive. The
cumulative frequency is the sum of the frequencies accumulated up to the upper boundary of a
class in the distribution.
The ogive is a graph that represents the cumulative frequencies for the classes in a
frequency distribution.
Step 2 Choose a suitable scale for the frequencies or cumulative frequencies, and
Step 3 Represent the class boundaries for the histogram or ogive, or the midpoint for the
frequency polygon, on the x axis. Step 4 Plot the points and then draw the bars or
lines.
Distribution Shapes
When one is describing data, it is important to be able to recognize the shapes of the
distribution values. In later chapters you will see that the shape of a distribution also determines
the appropriate statistical methods used to analyze the data. A distribution can have many shapes,
and one method of analyzing a distribution is to draw a histogram or frequency polygon for the
distribution. Several of the most common shapes are the bell-shaped or mound-shaped, the uniform
shaped, the J-shaped, the reverse J-shaped, the positively or right-skewed shape, the negatively or
left-skewed shape, the bimodal-shaped, and the U-shaped.
Distributions are most often not perfectly shaped, so it is not necessary to have an exact
shape but rather to identify an overall pattern. A bell-shaped distribution shown in Figure (a) has a
single peak and tapers off at either end. It is approximately symmetric; i.e., it is roughly the same
on both sides of a line running through the center.
A uniform distribution is basically flat or rectangular. See Figure (b). A J-shaped distribution is
shown in Figure (c), and it has a few data values on the left side and increases as one moves to the
right. A reverse J-shaped distribution is the opposite of the J-shaped distribution. See Figure (d).
When the peak of a distribution is to the left and the data values taper off to the right, a distribution
is said to be positively or right-skewed. See Figure (e). When the data values are clustered to the
right and taper off to the left, a distribution is said to be negatively or left-skewed. See Figure (f).
When a distribution has two peaks of the same height, it is said to be bimodal. See Figure (g).
Finally, the graph shown in Figure (h) is a U-shaped distribution.
Exercises.
1. Construct a histogram, frequency polygon, and ogive for the data using your prepared
frequency distribution in the previous exercises
2. Number of College Faculty The number of faculty listed for a variety of private colleges that
offer only bachelor’s degrees is listed below. Use these data to construct a frequency
distribution with 7 classes, a histogram, a frequency polygon, and an ogive. Discuss the
shape of this distribution. What proportion of schools have 180 or more faculty?
When the data are qualitative or categorical, bar graphs can be used to represent the data.
A bar graph can be drawn using either horizontal or vertical bars.
A bar graph represents the data by using vertical or horizontal bars whose heights or
lengths represent the frequencies of the data
Example
College Spending for First-Year Students The table shows the average money
spent by first-year college students. Draw a horizontal and vertical bar graph for the data.
Electronics $728
Clothing 141
Shoes 72
Solution
1. Draw and label the x and y axes. For the horizontal bar graph place the frequency
scale on the x axis, and for the vertical bar graph place the frequency scale on the y
axis. 2. Draw the bars corresponding to the frequencies. See Figure 2–10.
The graphs show that first-year college students spend the most on electronic equipment
including computers.
A Pareto chart is used to represent a frequency distribution for a categorical variable, and the
frequencies are displayed by the heights of vertical bars, which are arranged in order from highest
to lowest.
The Time Series Graph When data are collected over a period of time, they can be represented by
a time series graph.
A time series graph represents data that occur over a specific period of
time.
Example:
Workplace Homicides The number of homicides that occurred in the workplace for the
years 2003 to 2008 is shown. Draw and analyze a time series graph for the data.
Solution
Step 2 Label the x axis for years and the y axis for the number.
Step 4 Draw line segments connecting adjacent points. Do not try to fit a smooth curve through
the data points.
From the figure it can be seen that there was a slight decrease in the years ’04, ’05, and
’06, compared to ’03, and again an increase in ’07. The largest decrease occurred in ’08.
The Pie Graph Pie graphs are used extensively in statistics. The purpose of the pie graph
is to show the relationship of the parts to the whole by visually comparing the sizes of the sections.
Percentages or proportions can be used. The variable is nominal or categorical.
A pie graph is a circle that is divided into sections or wedges according to the percentage of
frequencies in each category of the distribution.
Stem and Leaf Plots The stem and leaf plot is a method of organizing data and is a
combination of sorting and graphing. It has the advantage over a grouped frequency distribution of
retaining the actual data while showing them in graphical form.
A stem and leaf plot is a data plot that uses part of the data value as the stem and part of
the data value as the leaf to form groups or classes.
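A stem and leaf plot for two-digit whole numbers can be sketched as below, using the tens digit as the stem and the ones digit as the leaf. The function name and the data values are hypothetical, and in this sketch a stem with no leaves is simply not printed.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole numbers by stem (tens) and collect sorted leaves (ones)."""
    plot = defaultdict(list)
    for v in sorted(values):          # sorting first keeps the leaves in order
        plot[v // 10].append(v % 10)
    return dict(plot)

# Hypothetical data, for illustration only.
data = [25, 31, 20, 32, 13, 14, 43, 57, 23, 36, 32, 44]
for stem, leaves in stem_and_leaf(data).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Unlike a grouped frequency distribution, every original value can be read back from the plot by joining a stem with one of its leaves.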
Exercises.
1. Math and Reading Achievement Scores The math and reading achievement scores from the
National Assessment of Educational Progress for selected states are listed below. Construct
a back-to-back stem and leaf plot with the data and compare the distributions.
Math Reading
52 66 69 62 61 65 76 76 66 67 63 57 59 59 55 71 70 70 66 61 55 59 74 72 73
61 69 78 76 77 68 76 73 77 77 80
1. Summarize data, using measures of central tendency, such as the mean, median, mode, and
midrange.
2. Describe data, using measures of variation, such as the range, variance, and standard
deviation.
3. Identify the position of a data value in a data set, using various measures of position, such
as percentiles, deciles, and quartiles.
4. Use the techniques of exploratory data analysis, including boxplots and five-number
summaries, to discover various aspects of data.
Measures found by using all the data values in the population are called parameters.
Measures obtained by using the data values from samples are called statistics; hence, the average of
the sales from a sample of representatives is a statistic, and the average of sales obtained from the
entire population is a parameter.
A statistic is a characteristic or measure obtained by using the data values from a sample.
A parameter is a characteristic or measure obtained by using all the data values from a
specific population.
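The mean is the sum of the values, divided by the total number of values. The
symbol $\bar{X}$ represents the sample mean.
$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{\sum X}{n}$
where n represents the total number of values in the sample. For a population, the Greek
letter $\mu$ (mu) is used for the mean:
$\mu = \frac{\sum X}{N}$
where N represents the total number of values in the population.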
110 76 29 38 105 31
Solution
Using the frequency distribution for finding the mean is given here
must be arranged in order. When the data set is ordered, it is called a data array. The
median is the midpoint of the data array. The symbol for the median is MD.
Example:
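Find the median of 713, 300, 618, 595, 311, 401, and 292.
Solution:
Step 1 Arrange the data in order: 292, 300, 311, 401, 595, 618, 713.
Step 2 Select the middle value. The median is 401.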
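The two steps, ordering the data and then taking the midpoint of the data array, can be sketched in Python. The function name is illustrative; for an even number of values the sketch averages the two middle values, which is the usual convention and an assumption here, since the example has an odd count.

```python
def median(values):
    """Return the midpoint of the ordered data array."""
    array = sorted(values)            # Step 1: arrange the data in order
    n = len(array)
    mid = n // 2
    if n % 2 == 1:                    # odd count: the single middle value
        return array[mid]
    # even count: average the two middle values
    return (array[mid - 1] + array[mid]) / 2

data = [713, 300, 618, 595, 311, 401, 292]
print(sorted(data))
print(median(data))
```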
The Mode
The third measure of average is called the mode. The mode is the value that occurs most often in the data set. It
is sometimes said to be the most typical case.
The value that occurs most often in a data set is called the mode.
A data set that has only one value that occurs with the greatest frequency is said to be unimodal. If
a data set has two values that occur with the same greatest frequency, both values are considered
to be the mode and the data set is said to be bimodal. If a data set has more than two values that
occur with the same greatest frequency, each value is used as the mode, and the data set is said to
be multimodal. When no data value occurs more than once, the data set is said to have no mode. A
data set can have more than one mode or no mode at all.
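The four cases described above (no mode, unimodal, bimodal, multimodal) can be told apart by counting how many values share the greatest frequency. The sketch below uses hypothetical function names and made-up data, and assumes a non-empty data set.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the greatest frequency,
    or an empty list when no value occurs more than once."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []                     # no data value repeats: no mode
    return sorted(v for v, c in counts.items() if c == top)

def describe(values):
    """Label the data set by its number of modes."""
    labels = {0: "no mode", 1: "unimodal", 2: "bimodal"}
    return labels.get(len(modes(values)), "multimodal")

# Made-up data sets, for illustration only.
print(modes([8, 9, 9, 14, 8, 8, 10]), describe([8, 9, 9, 14, 8, 8, 10]))
print(modes([1, 1, 2, 2, 3]), describe([1, 1, 2, 2, 3]))
print(modes([4, 7, 9]), describe([4, 7, 9]))
```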
The Mean
1. The mean is found by using all the values of the data.
2. The mean varies less than the median or mode when samples are taken from the same
population and all three measures are computed for these samples.
4. The mean for the data set is unique and not necessarily one of the data values.
5. The mean cannot be computed for the data in a frequency distribution that has an
open-ended class.
6. The mean is affected by extremely high or low values, called outliers, and may not be the
The Median
1. The median is used to find the center or middle value of a data set.
2. The median is used when it is necessary to find out whether the data values fall into the
4. The median is affected less than the mean by extremely high or extremely low values.
The Mode
3. The mode can be used when the data are nominal or categorical, such as religious
4. The mode is not always unique. A data set can have more than one mode, or the
The Midrange
Class limits    f     cf<
13-19           2     2
20-26           7     9
27-33           12    21
34-40           15    36
41-47           9     45
48-54           3     48
55-61           5     53
62-68           2     55
Total           55
Class limits    frequency (f)    midpoint (Xm)    f × Xm
13-19           2                16               32
20-26           7                23               161
27-33           12               30               360
34-40           15               37               555
41-47           9                44               396
48-54           3                51               153
55-61           5                58               290
62-68           2                65               130
Total           55                                2077
A     B     C     D
LC    UC    f     d     fd
13    19    2     -3    -6
20    26    7     -2    -14
27    33    12    -1    -12
34    40    15    0     0
41    47    9     1     9
48    54    3     2     6
55    61    5     3     15
62    68    2     4     8
Total       55          6
Sum of the negative fd values: (-6) + (-14) + (-12) = -32
Sum of the positive fd values: 9 + 6 + 15 + 8 = 38
Sum of fd: 38 - 32 = +6
AM = 37
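1. The zero deviation (d = 0) is placed at the class 34-40, which has the highest frequency.
2. AM is the assumed mean, which is the midpoint of the class with zero deviation.
3. Multiply the frequency by the deviation to fill in the fd column.
4. Then add the fd column.
5. Substitute into the formula
$\bar{X} = AM + \left(\frac{\sum fd}{n}\right) i = 37 + \left(\frac{6}{55}\right)(7) \approx 37.76$
where i is the class width.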
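As a check, the mean of this grouped distribution can be computed two ways in Python: directly from the class midpoints, and with the coded-deviation steps listed above. The formula used for the second method, mean = AM + (sum of fd / n) × i, is the standard assumed-mean formula and is an assumption here, since the notes do not print it; the variable names are illustrative.

```python
# (lower limit, upper limit, frequency) for each class in the notes.
classes = [(13, 19, 2), (20, 26, 7), (27, 33, 12), (34, 40, 15),
           (41, 47, 9), (48, 54, 3), (55, 61, 5), (62, 68, 2)]

n = sum(f for _, _, f in classes)                       # total frequency
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]    # 16, 23, ..., 65

# Method 1: midpoint method, mean = sum(f * Xm) / n.
mean_midpoint = sum(f * xm for (_, _, f), xm in zip(classes, midpoints)) / n

# Method 2: coded deviations around the assumed mean (assumed formula).
assumed_mean = 37        # midpoint of the class with d = 0 (34-40)
width = 7                # class width i
deviations = [round((xm - assumed_mean) / width) for xm in midpoints]  # -3..4
sum_fd = sum(f * d for (_, _, f), d in zip(classes, deviations))
mean_coded = assumed_mean + sum_fd / n * width

print(n, sum_fd)
print(round(mean_midpoint, 2), round(mean_coded, 2))
```

Both methods give the same value because the coded deviations are just the midpoints shifted by AM and divided by the class width.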
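LC    UC    f     cf<
13    19    2     2
20    26    7     9
27    33    12    21
34    40    15    36
41    47    9     45
48    54    3     48
55    61    5     53
62    68    2     55
Total       55
Compute the median.
Median class: 34-40 (class boundaries 33.5-40.5), f = 15, cf< = 36
cf< of the class before the median class = 21
LCB = 33.5
f = 15
n = 55
i = 7
$MD = LCB + \left(\frac{\frac{n}{2} - cf_<}{f}\right) i = 33.5 + \left(\frac{27.5 - 21}{15}\right)(7) \approx 36.53$
Interpretation: The median of 36.53 divides the distribution into the upper 50% and the
lower 50%.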
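The quantities listed above (LCB = 33.5, f = 15, n = 55, i = 7, and cf< = 21 before the median class) fit the usual grouped-median formula, median = LCB + ((n/2 - cf<) / f) × i. The notes do not print the formula, so treating it as the one intended is an assumption; the sketch below applies it and reproduces the stated result. The function name is hypothetical.

```python
def grouped_median(classes, width):
    """classes: list of (lower limit, upper limit, frequency), ascending.
    Assumes whole-number limits, so the lower boundary is limit - 0.5."""
    n = sum(f for _, _, f in classes)
    half = n / 2
    cumulative = 0                    # cf< of the classes already passed
    for lower, _, f in classes:
        if cumulative + f >= half:    # first class whose cf< reaches n/2
            lcb = lower - 0.5
            return lcb + (half - cumulative) / f * width
        cumulative += f

classes = [(13, 19, 2), (20, 26, 7), (27, 33, 12), (34, 40, 15),
           (41, 47, 9), (48, 54, 3), (55, 61, 5), (62, 68, 2)]
print(round(grouped_median(classes, 7), 2))
```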
3. Computing the Mode
Class    f     LCB
27-33    12
34-40    15    33.5
41-47    9
Measure of Variation
Consider the two sets of scores.
SET A SET B
10 35
60 45
50 30
30 35
40 40
20 25
https://docs.google.com/spreadsheets/u/3/d/1OwAzwVgqMORVrRBl40Klix0wji2BGuMnFcdefsPlBT4/edit
Since the means are equal in the data above, you might conclude that both brands of paint last equally well.
However, when the data sets are examined graphically, a somewhat different conclusion might be
drawn.
Even though the means are the same for both brands, the spread, or variation, is quite different.
Figure shows that brand B performs more consistently; it is less variable. For the spread or
variability of a data set, three measures are commonly used: range, variance, and standard
deviation. Each measure will be discussed in this section.
The variance is the average of the squares of the distance each value is from
the mean. The symbol for the population variance is $\sigma^2$ ($\sigma$ is the Greek lowercase
letter sigma). The formula for the population variance is
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
where
X = individual value
$\mu$ = population mean
N = population size
The standard deviation is the square root of the variance. The symbol for the
population standard deviation is $\sigma$. The corresponding formula for the population
standard deviation is
$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
Example
Find the variance and standard deviation of:
35, 45, 30, 35, 40, 25
Solution
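1. Find the mean.
$\mu = \frac{\sum X}{N} = \frac{35 + 45 + 30 + 35 + 40 + 25}{6} = \frac{210}{6} = 35$
2. Subtract the mean from each value, and place the result in the second
column ($X - \mu$).
3. Square each result and place the squares in column C of the table.
A     B        C
X     X - μ    (X - μ)^2
35    0        0
45    10       100
30    -5       25
35    0        0
40    5        25
25    -10      100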
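The remaining arithmetic (summing the squares, dividing by N, and taking the square root) can be sketched in Python using the population formulas given above. The function name is illustrative. Set A from the earlier table is included so the two spreads can be compared, since both sets have the same mean.

```python
import math

def population_variance(data):
    """Average of the squared distances from the mean (divides by N)."""
    N = len(data)
    mu = sum(data) / N
    return sum((x - mu) ** 2 for x in data) / N

set_a = [10, 60, 50, 30, 40, 20]
set_b = [35, 45, 30, 35, 40, 25]      # the data in this example

for name, data in (("Set A", set_a), ("Set B", set_b)):
    variance = population_variance(data)
    std_dev = math.sqrt(variance)     # standard deviation = sqrt(variance)
    print(name, sum(data) / len(data), round(variance, 2), round(std_dev, 2))
```

The much smaller standard deviation of Set B is what "less variable" means in the discussion above.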
Class f
6-10 1
11-15 2
16-20 3
21-25 5
26-30 4
31-35 3
36-40 2
1. Find the mean.
To find the mean, find the midpoint of each class and multiply the frequency by the midpoint, as in
column D. Then add column D and divide the sum by the total frequency,
which is 20.
A     B     C                 D
LC    UC    f     Midpoint    f × Midpoint
6     10    1     8           8
11    15    2     13          26
16    20    3     18          54
21    25    5     23          115
26    30    4     28          112
31    35    3     33          99
36    40    2     38          76
Total       20                490
Mean = 490 / 20 = 24.5
2. Subtract the mean from each midpoint, and place the result in column D.
LC    UC    f     Midpoint    D (Midpoint - mean)
6     10    1     8           -16.5
11    15    2     13          -11.5
16    20    3     18          -6.5
21    25    5     23          -1.5
26    30    4     28          3.5
31    35    3     33          8.5
36    40    2     38          13.5
Total       20
3. Square the column D, and place the result in column E, (Midpoint - mean)^2.
Steps in solving the variance and standard deviation for grouped data
1. Make a table as shown, and find the midpoint of each class.
A: Class (LC UC)    B: f    C: Xm    D: f·Xm    E: f·Xm²
6  10               1       8
11 15               2       13
16 20               3       18
21 25               5       23
26 30               4       28
31 35               3       33
36 40               2       38
2. Multiply the frequency by the midpoint for each class, and place the products in column D.
A: Class (LC UC)    B: f    C: Xm    D: f·Xm    E: f·Xm²
6  10               1       8        8
11 15               2       13       26
16 20               3       18       54
21 25               5       23       115
26 30               4       28       112
31 35               3       33       99
36 40               2       38       76
3. Multiply the frequency by the square of the midpoint, and place the products
in column E.
A: Class (LC UC)    B: f    C: Xm    D: f·Xm    E: f·Xm²
6  10               1       8        8          64
11 15               2       13       26         338
16 20               3       18       54         972
21 25               5       23       115        2645
26 30               4       28       112        3136
31 35               3       33       99         3267
36 40               2       38       76         2888
4. Find the sums of columns B, D, and E. The sum of column B is n, the sum of column D is
Σf·Xm, and the sum of column E is Σf·Xm². The completed table is shown.
A: Class (LC UC)    B: f    C: Xm    D: f·Xm    E: f·Xm²
6  10               1       8        8          64
11 15               2       13       26         338
16 20               3       18       54         972
21 25               5       23       115        2645
26 30               4       28       112        3136
31 35               3       33       99         3267
36 40               2       38       76         2888
Total               20               490        13310
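Formula
$s^2 = \frac{n\left(\sum f \cdot X_m^2\right) - \left(\sum f \cdot X_m\right)^2}{n(n - 1)}$ and $s = \sqrt{s^2}$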
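The four steps for grouped data can be sketched in Python as follows. The sketch assumes the sample (n − 1) shortcut formula, s² = [n(Σf·Xm²) − (Σf·Xm)²] / [n(n − 1)], which is built from the same column sums as step 4; if the data are treated as a population, the denominator would be n² instead.

```python
import math

# (lower class limit, upper class limit, frequency)
classes = [(6, 10, 1), (11, 15, 2), (16, 20, 3), (21, 25, 5),
           (26, 30, 4), (31, 35, 3), (36, 40, 2)]

n = 0          # sum of column B
sum_fx = 0.0   # sum of column D: f * Xm
sum_fx2 = 0.0  # sum of column E: f * Xm^2
for lower, upper, f in classes:
    midpoint = (lower + upper) / 2      # column C
    n += f
    sum_fx += f * midpoint
    sum_fx2 += f * midpoint ** 2

# Shortcut formula (sample version, an assumption stated above).
variance = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))
std_dev = math.sqrt(variance)
print(n, sum_fx, sum_fx2, round(variance, 2), round(std_dev, 2))
```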
Set 1
94, 90, 90, 90, 90, 86, 84, 84, 82, 82, 80, 80, 74, 72, 72,
72, 72, 71, 70, 68, 68, 66, 66, 66, 64, 64, 64, 64, 58, 54
Set 2
97, 98, 96, 94, 90, 89, 87, 86, 84, 80, 78, 75, 70, 68, 65,
53, 48, 46, 42, 40, 35, 33, 33, 32, 28, 26, 25, 22, 20, 29
k= 1, 2, 3…...99
Cf<
21 25 9
26 30 8
31 35 5
36 40 2
Total 40
Item analysis is a process of examining the students’ responses to individual items in the test. It
consists of different procedures for assessing the quality of the test items given to the students.
Through item analysis we can identify which of the given test items are good and which are defective.
Good items are to be retained, and defective items are to be improved, revised, or
rejected.
There are three common types of quantitative item analysis, which provide the teacher with
three different types of information about individual test items. These are the difficulty index,
the discrimination index, and the response option analysis.
Difficulty index refers to the proportion of the number of students in the upper and lower
groups who answered an item correctly.
Discrimination index is the power of the item to distinguish between the low-performing students
and the high-performing students.
To determine the level of difficulty of an item, use the ranges given below.
Index range Difficulty level
0.00-0.20 Very difficult
0.21-0.40 Difficult
0.41-0.60 Moderately difficult
0.61-0.80 Easy
0.81-1.00 Very easy
To determine the level of discrimination, Ebel and Frisbie (1985) recommend the use of the
ranges below.
0.19 and below Poor item, should be eliminated or revised
0.20-0.29 Marginal item, needs some revision
0.30-0.39 Reasonably good item but possibly subject to improvement
0.40 and above Very good item
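The two indices and their interpretation can be sketched in Python. The sketch assumes the upper and lower groups have the same size, that the difficulty index is (U + L) / (2 × group size), and that the discrimination index is (U − L) / group size, where U and L are the numbers of correct answers in each group. The function names and the sample counts are illustrative only.

```python
def difficulty_index(upper_correct, lower_correct, group_size):
    """Proportion of students in both groups who answered the item correctly."""
    return (upper_correct + lower_correct) / (2 * group_size)


def discrimination_index(upper_correct, lower_correct, group_size):
    """Difference between the groups' correct answers, per student in one group."""
    return (upper_correct - lower_correct) / group_size


def interpret_difficulty(p):
    if p <= 0.20:
        return "Very difficult"
    if p <= 0.40:
        return "Difficult"
    if p <= 0.60:
        return "Moderately difficult"
    if p <= 0.80:
        return "Easy"
    return "Very easy"


def interpret_discrimination(d):
    if d < 0.20:
        return "Poor item"
    if d < 0.30:
        return "Marginal item"
    if d < 0.40:
        return "Reasonably good item"
    return "Very good item"


# Illustrative item: 7 of 13 upper-group and 4 of 13 lower-group students are correct.
p = difficulty_index(7, 4, 13)
d = discrimination_index(7, 4, 13)
print(round(p, 2), interpret_difficulty(p))
print(round(d, 2), interpret_discrimination(d))
```

A negative discrimination index (more lower-group than upper-group students correct) falls into the "Poor item" range in this sketch.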
Upper Group
Lower Group
4. Compute the difficulty index and the discrimination index, and analyze each
response option (distracter).
5. Make an analysis.
Computing the difficulty index and discrimination index
1.
Option A B* C D E Difficulty Index Interpretation Discrimination Index Interpretation Remarks
Example of Analysis
a. Only 35 percent of the examinees got the answer correct; hence, the test item is difficult.
b. More students from the upper group got the answer correct; hence, the item has positive
discrimination.
c. Retain options A, C, and E because most of the students who selected them are in the lower
group. These options are plausible but incorrect.
Conclusion: Retain the test item but change option D to make it effective (plausible but
incorrect) for both the upper and lower groups. At least 5% of the examinees should choose each
incorrect option.
2.
Option A B C D* E Difficulty Index Interpretation Discrimination Index Interpretation Remarks
3.
Option A B C D E* Difficulty Index Interpretation Discrimination Index Interpretation Remarks
Upper group 1 2 3 10 4
5
Lower Group 3 4 4 4
5. Ambiguous item (this happens when students from the upper group choose an incorrect option
and the keyed answer about equally).
Option A B C D E* Difficulty Index Interpretation Discrimination Index Interpretation Remarks
Upper group 7 1 1 2 8
6
Lower Group 6 2 3 3
6. Guessing item (students from the upper group have an equal spread of choices among the given
alternatives).
Option A B C* D E Difficulty Index Interpretation Discrimination Index Interpretation Remarks
Upper group 4 3 4 3 6
Lower group 3 4 3 4 5
7.
Option A B C* D E Difficulty Index Interpretation Discrimination Index Interpretation Remarks
Upper group 4 3 4 3 6
Lower group 3 4 3 4 5
II.
The first 13 students are the upper group and the next 13 students are the lower group.
Student / Item number  1 2 3 4 5 6 8 9 10 11 12 13 15
A 1 0 1 1 1 1 1 1 1 1 0 1 1
B 1 1 1 0 1 1 1 1 1 0 1 1 1
C 1 0 1 1 1 1 1 1 1 0 1 1 1
D 1 1 1 0 1 1 1 1 1 0 0 1 1
E 1 0 1 1 1 1 0 1 1 0 0 1 1
F 1 0 1 0 1 1 1 1 0 1 0 0 1
G 0 1 1 1 0 1 1 1 1 1 1 0 1
H 1 1 0 0 1 1 1 1 0 1 1 0 1
I 0 1 0 1 1 1 1 1 0 0 0 0 1
J 0 1 0 1 1 1 0 1 1 0 0 1 1
K 0 1 1 1 0 1 1 1 1 1 0 1 1
L 0 1 0 1 1 1 0 1 0 1 1 1 1
M 0 1 1 1 0 1 0 1 1 1 1 0 1
N 0 1 0 0 1 1 0 1 1 0 1 0 1
O 0 0 0 0 1 1 0 1 0 1 1 0 1
P 0 0 0 0 1 1 0 1 1 0 1 0 1
Q 1 0 0 0 1 1 0 1 1 0 1 0 1
R 0 0 1 0 0 1 0 1 1 1 1 1 1
S 0 1 1 0 1 1 0 1 1 0 1 1 1
T 1 0 0 0 0 1 1 1 0 1 1 1 1
U 0 0 1 1 1 1 0 1 1 0 1 1 1
V 0 0 1 0 0 1 1 1 0 1 1 0 1
W 0 0 0 0 1 1 0 1 1 0 1 0 1
X 1 0 1 1 0 1 1 1 0 1 1 0 1
Y 1 1 0 1 1 1 0 1 1 1 1 0 1
Z 0 0 0 1 0 1 1 1 0 1 1 0 0
U 7 9 9
L 4 3 5
VI MD MD
IVI marginal RG
Split-half reliability resembles alternate form reliability in that you separate the test into halves and
then compare each half. For example, let’s say that we have a 50-item test. The most common
approach is to split the test into two tests by separating the odd-numbered and even-numbered
items. First, compute the score that each student obtained on the 25 odd-numbered items (items 1,
3, 5, 7, . . ., 49). Then, compute the score that each student obtained on the 25 even-numbered items
(items 2, 4, 6, 8, . . ., 50). Now you have two scores for each student. Next, you compute the
correlation between the two sets of scores.
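The odd-even split described above can be sketched in Python. The response matrix below is hypothetical (6 students, 6 items, 1 = correct, 0 = wrong) and is used only to show the mechanics; `pearson_r` and `split_half_scores` are illustrative helper names.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def split_half_scores(item_matrix):
    """Return each student's score on the odd-numbered and even-numbered items."""
    odd = [sum(row[0::2]) for row in item_matrix]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    return odd, even


# Hypothetical responses: one row per student, one column per item.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0, 1],
]

odd_scores, even_scores = split_half_scores(responses)
half_test_r = pearson_r(odd_scores, even_scores)
print(odd_scores, even_scores, round(half_test_r, 2))
```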
Alternate form (alternate forms reliability): Develop two parallel forms of the same test. Give each
form to the same group.
Split-half (internal consistency): Give the test to a group. Divide the items into odd and even.
Correlate the scores for each half test.
KR-20 and Cronbach’s α (internal consistency): Give the test to a group. Compute the KR-20 or KR-21
reliability or Cronbach’s α.
0.71-0.80 Good for a classroom test. There are probably a few items that need to be
improved.
0.51-0.60 Suggests a need for revision of the test, unless it is quite short (ten or fewer
items). Needs to be supplemented by other measures (more tests) for grading.
0.50 and below Questionable reliability. This test should not contribute heavily
to the course grade, and it needs revision.
1 36 38
2 26 34
3 38 38
4 15 27
5 17 25
6 28 26
7 32 35
8 35 36
9 12 19
10 35 38
Find the
The reliability coefficient computed using the Pearson r is 0.91, which means that the test has very
high reliability. The scores of the ten students, who were tested twice with a one-day interval, are
consistent.
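The coefficient of 0.91 can be checked with the computational formula for the Pearson r. The sketch assumes that the two columns of the table above are the ten students' scores on the first and second administrations of the test.

```python
import math

first = [36, 26, 38, 15, 17, 28, 32, 35, 12, 35]    # assumed: first administration
second = [38, 34, 38, 27, 25, 26, 35, 36, 19, 38]   # assumed: second administration

n = len(first)
sum_x = sum(first)
sum_y = sum(second)
sum_xy = sum(x * y for x, y in zip(first, second))
sum_x2 = sum(x * x for x in first)
sum_y2 = sum(y * y for y in second)

# r = [n(Σxy) − (Σx)(Σy)] / sqrt{[n(Σx²) − (Σx)²][n(Σy²) − (Σy)²]}
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(sum_x, sum_y, sum_xy, sum_x2, sum_y2, round(r, 2))
```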
Alternate form
Also use the Pearson product-moment correlation coefficient.
Two forms of a test were administered to ten students. Is the test reliable?
Student Form I Form II
1 12 20
2 20 22
3 19 23
4 17 20
5 25 25
6 22 20
7 15 19
8 16 18
9 23 25
10 21 24
Solution
Find:
Split-half Test
Reliability of the half test is computed by using the Pearson product-moment
correlation.
1 15 20
2 19 17
3 20 24
4 25 21
5 20 23
6 18 22
7 19 25
8 26 24
9 20 18
10 18 17
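Internal Consistency
KR-20
$KR_{20} = \frac{k}{k - 1}\left(1 - \frac{\sum pq}{\sigma^2}\right)$
Where
k is the number of items
p is the proportion of students who got the item correct (index of
difficulty)
q = 1 - p
σ² is the variance of the total scores
KR-21
$KR_{21} = \frac{k}{k - 1}\left(1 - \frac{\bar{x}(k - \bar{x})}{k\sigma^2}\right)$
Where
k is the number of items
x̄ is the mean
σ² is the variance of the total scores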
A 40-item test was administered to 15 students. Find the reliability of the test using the
Kuder-Richardson formula.
Student Score
1 16
2 25
3 35
4 39
5 25
6 18
7 19
8 22
9 33
10 36
11 20
12 17
13 26
14 35
15 39
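Because only total scores are given, the exercise can be sketched with KR-21, which needs just the number of items, the mean, and the variance of the total scores: KR-21 = [k / (k − 1)] × [1 − x̄(k − x̄) / (k × variance)]. The sketch assumes the sample variance (n − 1 in the denominator); using the population variance instead would give a slightly lower coefficient.

```python
from statistics import mean, variance

scores = [16, 25, 35, 39, 25, 18, 19, 22, 33, 36, 20, 17, 26, 35, 39]
k = 40  # number of items in the test


def kr21(total_scores, number_of_items):
    """KR-21 reliability from total scores (sample variance is an assumption)."""
    m = mean(total_scores)
    var = variance(total_scores)  # sample variance, divides by n - 1
    factor = number_of_items / (number_of_items - 1)
    return factor * (1 - (m * (number_of_items - m)) / (number_of_items * var))


reliability = kr21(scores, k)
print(round(reliability, 2))
```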
Teacher A administered a 20-item true-or-false test to his Math class. Below are the test results of
40 students. Find the reliability coefficient using the KR-20 formula, interpret the computed
value, and also solve for the coefficient of determination.
Item Number    x
1 25
2 36
3 28
4 23
5 25
6 33
7 38
8 15
9 23
10 25
11 36
12 35
13 19
14 39
15 28
16 33
17 19
18 37
19 36
20 25
QUESTIONS
Student 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
1  0 1 0 1 1 1 1 1 1 1 0 1 0 0 0 1 1 0 1 1
2  1 0 0 1 0 0 1 1 0 1 1 0 1 0 1 0 1 0 1 1
3  1 0 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1
4  1 1 0 1 1 1 0 1 1 1 1 1 0 0 0 1 1 1 1 1
5  1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1
6  1 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 1 0 1 1
7  1 0 0 0 0 1 1 0 1 1 0 0 1 0 0 0 0 0 1 0
8  1 0 0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 1 1 0
9  1 0 1 1 1 0 0 1 0 0 1 0 1 0 1 1 0 0 1 0
10 1 1 0 1 1 0 0 1 1 1 0 1 1 1 1 0 0 1 1 1
11 1 1 1 1 1 1 1 0 0 0 1 0 1 0 1 0 1 1 0 1
12 1 1 0 0 1 0 0 1 0 1 0 0 1 0 1 0 1 0 0 1
13 1 0 0 1 0 1 0 1 0 1 1 1 1 0 1 0 1 0 1 0
14 1 0 0 1 0 0 1 1 0 0 0 1 1 0 1 0 0 0 1 0
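A response matrix like the one above is what KR-20 needs, since p and q are computed item by item. The following is a sketch only: it applies KR-20 = [k / (k − 1)] × [1 − Σpq / variance] to the 14 rows shown, and it assumes the sample variance of the total scores.

```python
from statistics import variance

# One row per student, one column per item (1 = correct, 0 = wrong).
responses = [
    [0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1],
    [1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0],
    [1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1],
    [1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0],
]


def kr20(item_matrix):
    """KR-20 reliability of a 0/1 response matrix (sample variance assumed)."""
    n_students = len(item_matrix)
    k = len(item_matrix[0])
    total_var = variance([sum(row) for row in item_matrix])
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n_students  # index of difficulty
        sum_pq += p * (1 - p)                                # q = 1 - p
    return (k / (k - 1)) * (1 - sum_pq / total_var)


totals = [sum(row) for row in responses]
reliability = kr20(responses)
print(totals, round(reliability, 2))
```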
Validity
The most important characteristic of a test is validity. Does a test actually measure what
we think it is measuring?
Content-related evidence of validity refers to the match between the test items
and the content that was taught.
One way to look at validity is to use the content-related approach. We will be examining
content-related evidence of validity as a general concept. We will also look at three
specific types of content-related evidence of validity: instructional validity, curricular
validity, and face validity.
Instructional validity refers to the match between the items on the test and the
material that was taught.
Curricular validity refers to the match between the items on the test and the official
curriculum.
If the items on a test appear to be measuring the appropriate skills, and appear to be
appropriate for the students taking the test, the test is said to have face validity.
To establish criterion-related validity, you actually have to follow three steps. You give a
test to a group of students. Next, each student is required to perform a series of tasks that
also measure the same skill. Finally, you correlate the test scores with the scores that the
students obtain with the alternative assessment. This correlation coefficient can be
interpreted as a validity coefficient.
Let’s say that you are teaching your 7th-grade students a variety of math skills. At some
point, you will want to see how well they can perform those skills. If you give them
problems to solve demonstrating all of the skills that were taught, it might take them
several hours to be able to demonstrate all of those skills. Instead, you can develop a test
that samples the skills that were taught and can be administered in 40 minutes. Hopefully,
the briefer test will be as effective in measuring their skills as would the longer procedure.
Essentially, you are using the test score as an estimate of how the students would do using
all of the skills taught. In this case the longer procedure is the criterion and, hopefully, the
shorter test demonstrates criterion-related evidence of validity.
Concurrent Validity
Criterion-related evidence of validity comes in two forms, the first of which is known as
concurrent validity. Concurrent validity is demonstrated when a test is correlated with another
measure of the same behavior or skill that is taken at about the same time as the test is given. The
test measures the student’s current skill level.
A test has concurrent validity if it displays a positive correlation with another method of
measuring the same behavior or skill given at about the same time.
For another example of concurrent validity, imagine that you are a driver’s education
teacher and that you have developed a paper-and-pencil test of driving skill. After you
administer the written test to your students you also evaluate them on a driving course
where they actually have to deal with simulated driving challenges. If the scores on the
written driver’s test correlate well with the scores that the students received in negotiating
the driving course, then you have demonstrated evidence that your written test has
concurrent validity.
Predictive Validity
The other type of criterion-related evidence of validity is known as predictive validity.
We can sometimes use a test to predict future performance. Good examples of this
include the SATs and the ACTs.
A test is said to have predictive validity if it is positively correlated with some future
behavior or skill.
Intelligence tests are also expected to have predictive validity. When Alfred Binet
developed the first intelligence test in France in the early 1900s, it was expected to be able
to predict those children who would be successful in school and those who would struggle
with the academic demands of the classroom. Since intelligence tests do predict future
school performance, that is the primary reason for using them in school.
With criterion-related evidence of validity we are correlating a test with some other
measure. Therefore, the correlation coefficient is frequently interpreted as a validity
coefficient.
Construct-Related Evidence of Validity
A measurement device is said to display construct-related evidence of validity if it measures what
the appropriate theory says that it should be measuring.
Researchers compute a test value from the sample data to decide whether the null
hypothesis should be rejected. Statistical tests can be one-tailed or two-tailed,
depending on the hypotheses.
The null hypothesis is rejected when the difference between the population
parameter and the sample statistic is said to be significant. The difference is
significant when the test value falls in the critical region of the distribution. The
critical region is determined by the level of significance of the test. The level is
the probability of committing a type I error. This error occurs when the null
hypothesis is rejected when it is true. Three generally agreed upon significance
levels are 0.10, 0.05, and 0.01.
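$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Where
$\bar{X}_1 - \bar{X}_2$ is the observed difference between the sample means
$\mu_1 - \mu_2$ is the expected value, which is equal to zero
Assumptions for the t test for independent means when the variances of the populations
are unknown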
Example:
Can it be concluded that the means of the data below are different at the 0.05 level of significance?
GROUPS A B
Sample size 8 10
Solution
Step 1. State the hypotheses and identify the claim.
H0: There is no significant difference between the means of group A and group B.
H1: There is a significant difference between the means of group A and group B (claim).
Step 2. Find the critical values. Since the test is two-tailed, since α = 0.05, and since the
variances are unequal, the degrees of freedom are the smaller of n1 − 1 or n2 − 1. In
this case, the degrees of freedom are 8 − 1 = 7. Hence, the critical values are +2.365 and −2.365.
Step 4. Make the decision. Do not reject the null hypothesis, since −0.57 > −2.365.
Step 5. Summarize the results. There is not enough evidence to support the claim that the
means of the two groups are different.
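The steps of this test can be sketched in Python. The means and standard deviations below are hypothetical, because the example's summary statistics did not survive in these notes; only the sample sizes (8 and 10) and the critical value 2.365 come from the example. The test statistic assumes unequal variances, t = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2), with degrees of freedom equal to the smaller of n1 − 1 and n2 − 1.

```python
import math


def t_independent(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for two independent means, variances not assumed equal."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t_value = (mean1 - mean2) / standard_error   # expected difference is zero
    df = min(n1 - 1, n2 - 1)                     # smaller of n1 - 1 and n2 - 1
    return t_value, df


# Hypothetical summary statistics (means and standard deviations are made up).
t_value, df = t_independent(mean1=10, sd1=2, n1=8, mean2=11, sd2=3, n2=10)

critical_value = 2.365  # two-tailed, alpha = 0.05, df = 7, as in the example
reject_null = abs(t_value) > critical_value
print(round(t_value, 3), df, reject_null)
```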
Try This
1. Hours Spent Watching Television According to Nielsen Media Research, children
(ages 2–11) spend an average of 21 hours 30 minutes watching television per week
while teens (ages 12–17) spend an average of 20 hours 40 minutes. Based on the
sample statistics obtained below, is there sufficient evidence to conclude a difference
in average television watching times between the two groups? Use α = 0.01.
Children Teens
Sample size 15 15
2. Test the claim that the means are different. Test at the 0.05 level of significance.
Group I Group II
When the samples are dependent, a special t test for dependent means is used.
This test employs the difference in values of the matched pairs. The hypotheses
are as follows:
Whe
Example
Try this….
1. Retention Test Scores A sample of non-English majors at a selected college was used in
a study to see if the student retained more from reading a 19th-century novel or by
watching it in DVD form. Each student was assigned one novel to read and a different
one to watch, and then they were given a 20-point written quiz on each novel. The test
results are shown below. At α = 0.05, can it be concluded that the book scores are higher
than the DVD scores?
BOOK 90 80 90 75 80 90 84
DVD 85 72 80 80 70 75 80
2. Improving Study Habits As an aid for improving students’ study habits, nine students
were randomly selected to attend a seminar on the importance of education in life. The
table shows the number of hours each student studied per week before and after the
seminar. At α = 0.10, did attending the seminar increase the number of hours
the students studied per week?
Before 9 12 6 15 3 18 10 13 7
After 9 17 9 20 2 21 15 22 6
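The mechanics of the t test for dependent means can be sketched with the book and DVD scores from the first exercise. The sketch assumes the usual statistic for matched pairs, t = D̄ / (s_D / √n) with n − 1 degrees of freedom, where D is the difference within each pair; the decision still requires comparing the result with the critical value from a t table.

```python
import math

book = [90, 80, 90, 75, 80, 90, 84]
dvd = [85, 72, 80, 80, 70, 75, 80]


def paired_t(first, second):
    """t statistic and degrees of freedom for matched pairs (first minus second)."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample variance of the differences (divides by n - 1).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    t_value = mean_d / math.sqrt(var_d / n)
    return t_value, n - 1


t_value, df = paired_t(book, dvd)
print(round(t_value, 3), df)
```

The same function can be applied to the Before and After study hours by passing the two lists in the order that matches the direction of the claim.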
To answer the first two questions, statisticians use a numerical measure to determine whether two
or more variables are linearly related and to determine the strength of the relationship between or
among the variables. This measure is called a correlation coefficient.
Student    Hours of study (x)    Grade (y)
A          6                     82
B          2                     63
C          1                     57
D          5                     88
E          2                     68
F          3                     75
A scatter plot is a graph of the ordered pairs (x, y) of numbers consisting of the
independent variable x and the dependent variable y.
The scatter plot is a visual way to describe the nature of the relationship between the
independent and dependent variables. The scales of the variables can be different, and
the coordinates of the axes are determined by the smallest and largest data values of
the variables.
Examples
A bivariate normal distribution means that for the pairs of (x, y) data values, the corresponding y
values have a bell-shaped distribution for any given x value, and the x values for any given y
value have a bell-shaped distribution.
Formally defined, the population correlation coefficient ρ is the correlation computed by using
all possible pairs of data values (x, y) taken from a population.
When the null hypothesis is rejected at a specific level, it means that there is a significant
difference between the value of r and 0. When the null hypothesis is not rejected, it means that
the value of r is not significantly different from 0 (zero) and is probably due to chance.
An educator wants to see how the number of absences for a student in her class affects the
student’s final grade. The data obtained from a sample are shown.
Number of absences(x) 10 12 2 0 8 5
Final grade (y) 70 65 96 94 75 82
What is the predicted final grade if the student has 4 absences? 6 absences?
$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}$ or $a = \bar{y} - b\bar{x}$
$b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$
Where a is the y′ intercept and b is the slope of the line
y′ = a + bx
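The regression line for the absences example can be worked out with a short Python sketch. It uses the standard least-squares sums for the intercept a and the slope b and then evaluates y′ = a + bx; the function name `predict` is illustrative.

```python
absences = [10, 12, 2, 0, 8, 5]        # x
final_grade = [70, 65, 96, 94, 75, 82]  # y

n = len(absences)
sum_x = sum(absences)
sum_y = sum(final_grade)
sum_xy = sum(x * y for x, y in zip(absences, final_grade))
sum_x2 = sum(x * x for x in absences)

denominator = n * sum_x2 - sum_x ** 2
a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator  # y' intercept
b = (n * sum_xy - sum_x * sum_y) / denominator       # slope


def predict(x):
    """Predicted final grade y' = a + bx for a given number of absences."""
    return a + b * x


print(round(a, 3), round(b, 3))
print(round(predict(4), 1), round(predict(6), 1))
```

The negative slope shows that the predicted grade falls as the number of absences rises.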
Activity
SAT Scores Educational researchers desired to find if a relationship exists between the average
SAT verbal score and the average SAT mathematical score. Several states were randomly
selected, and their SAT average scores are recorded below. Is there sufficient evidence to
conclude a relationship between the two scores?
Verbal (x) 526 504 594 585 503 589
Math (y)   530 522 606 588 517 589
For example, suppose a market analyst wished to see whether consumers have any preference among
the five flavors of a new fruit soda.
Cherry Strawberry Orange Lime Grape
32 28 16 14 10
If there is no preference, one would expect each flavor to be selected with equal frequency. In this
case, the expected frequency is 100/5 = 20.
Part of the table for the chi-square distribution. The degrees of freedom are defined as the number
of categories minus one.
Is there enough evidence to reject the claim that there is no preference in the selection of fruit
soda flavors?
Cherry Strawberry Orange Lime Grape
32 28 16 14 10
Solution
Step 1 State the hypotheses and identify the claim.
H0: Consumers show no preference for flavors (claim).
H1: Consumers show a preference.
Step 2 Find the critical value. The degrees of freedom are 5 − 1 = 4, and α = 0.05. Hence, the
critical value from Table G in Appendix C is 9.488.
Step 3 Compute the test value by subtracting the expected value from the corresponding
observed value, squaring the result and dividing by the expected value, and finding the sum. The
expected value for each category is 20, as shown previously.
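Step 3 can be sketched in Python. The sketch follows the description above (observed minus expected, squared, divided by expected, then summed) and compares the result with the critical value 9.488 from Step 2; the variable names are illustrative.

```python
observed = {"Cherry": 32, "Strawberry": 28, "Orange": 16, "Lime": 14, "Grape": 10}

total = sum(observed.values())
expected = total / len(observed)  # equal frequency for each flavor if no preference

# Chi-square test value: sum of (O - E)^2 / E over all categories.
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())

degrees_of_freedom = len(observed) - 1
critical_value = 9.488  # alpha = 0.05, d.f. = 4, from Step 2
reject_null = chi_square > critical_value
print(expected, chi_square, degrees_of_freedom, reject_null)
```

Since the test value exceeds the critical value, the null hypothesis of no preference is rejected under these figures.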
1 18 24
2 17 28
3 14 30
4 13 26
5 12 22
6 18 18
7 8 15
8 8 12
F- test
With the F test, two different estimates of the population variance are made. The first
estimate is called the between-group variance, and it involves finding the variance of
the means. The second estimate, the within-group variance, is made by computing the
variance using all the data and is not affected by differences in the means.
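If there is no difference in the means, the between-group variance estimate will be approximately
equal to the within-group variance estimate, and the F test value will be close to 1.
However, when the means differ significantly, the between-group variance will be much
larger than the within-group variance; the F test value will be significantly greater than 1;
and the null hypothesis will be rejected. Since variances are compared, this procedure is
called analysis of variance (ANOVA).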
For a test of the difference among three or more means, the following hypotheses
should be used:
H0: μ1 = μ2 = μ3 = . . . = μk
H1: At least one mean is different from the others.
The degrees of freedom for this F test are d.f.N. = k − 1, where k is the number of groups, and
d.f.D. = N − k, where N is the sum of the sample sizes of the groups,
N = n1 + n2 + . . . + nk.
The sample sizes need not be equal. The F test to compare means is always right-tailed.
A researcher wishes to try three different techniques to lower the blood pressure of individuals
diagnosed with high blood pressure. The subjects are randomly assigned to three groups; the
first group takes medication, the second group exercises, and the third group follows a special
diet. After four weeks, the reduction in each person’s blood pressure is
recorded. At α = 0.05, test the claim that there is no difference among the means. The data are
shown.
Medication Exercise Diet
10 6 5
12 8 9
9 3 12
15 0 8
13 2 4
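The F test value for this example can be sketched in Python. The sketch computes the between-group variance as the sum of squares between groups divided by k − 1 and the within-group variance as the sum of squares within groups divided by N − k, matching the degrees of freedom given above. The function name is illustrative, and the resulting F still has to be compared with the critical value from an F table for d.f.N. = 2 and d.f.D. = 12.

```python
medication = [10, 12, 9, 15, 13]
exercise = [6, 8, 3, 0, 2]
diet = [5, 9, 12, 8, 4]


def one_way_anova(groups):
    """Return the F test value with its numerator and denominator degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of the values around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_n = k - 1
    df_d = n_total - k
    f_value = (ss_between / df_n) / (ss_within / df_d)
    return f_value, df_n, df_d


f_value, df_n, df_d = one_way_anova([medication, exercise, diet])
print(round(f_value, 2), df_n, df_d)
```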