Statistics and Probability: Quarter 4 - Module 7 Bivariate Data
Department of Education
Region I
SCHOOLS DIVISION OF ILOCOS NORTE
MELC:
Illustrates the nature of bivariate data.
Constructs a scatter plot.
Describes shape (form), trend (direction), and
variation (strength) based on a scatter plot.
(K to 12 BEC CG: M11/12SP-IV-g2-g4)
Prepared by:
MARISSA G. AREOLA
Teacher I
Bangui National High School
Probability and Statistics - Grade 11
Share-A-Resource-Program
Quarter 4 – Module on Bivariate Data
First Edition, 2020
Republic Act 8293, section 176 states that: No copyright shall subsist in
any work of the Government of the Philippines. However, prior approval of the
government agency or office wherein the work is created shall be necessary for
exploitation of such work for profit. Such agency or office may, among other things,
impose as a condition the payment of royalties.
Borrowed materials (i.e., songs, stories, poems, pictures, photos, brand
names, trademarks, etc.) included in this book are owned by their respective
copyright holders. Every effort has been exerted to locate and seek permission to use
these materials from their respective copyright owners. The publisher and authors
do not represent nor claim ownership over them.
A pre-test is provided to measure your prior knowledge of the lesson. It will
show you whether you may proceed with this module or need to ask your
facilitator or teacher for assistance in understanding the lesson. At the
end of this module, you need to answer the post-test to self-check your learning.
Answer keys are provided for all activities and tests. We trust that you will be honest
in using them.
In addition to the material in the main text, Notes to the Teacher are also
provided to our facilitators and parents for strategies and reminders on how they can
best help you in your home-based learning.
Please use this module with care. Do not put unnecessary marks on any part
of this CLM. Use a separate sheet of paper in answering the exercises and tests.
Likewise, read the instructions carefully before performing each task.
If you have any questions about using this CLM or any difficulty in answering the
tasks in this module, do not hesitate to consult your teacher or facilitator.
Thank you.
What I Need to Know
This module was designed and written with you in mind. It is here to help you
master bivariate data. The scope of this module permits it to be used in many
different learning situations. The language used recognizes the diverse vocabulary
level of students. The lessons are arranged to follow the standard sequence of the
course. But the order in which you read them can be changed to correspond with
the textbook you are now using.
What I Know
This part includes an activity that aims to check what you already know about
the lesson so let us have some fun. Read each question carefully and choose the
letter of the best answer. Write the chosen letter on a separate sheet of paper and
submit a copy of it to your subject teacher.
1. Which one of the following is NOT appropriate for studying the relationship
between two quantitative variables?
a. Scatterplot b. Bar chart c. Correlation d. Regression
6. If all the dots of a scatter diagram lie on a straight line rising from the
lower left corner to the upper right corner, the correlation is called _________.
a. Zero correlation
b. Perfect negative correlation
c. High degree of positive correlation
d. Perfect positive correlation
7. When the values of two variables move in the opposite directions, correlation
is said to be ____________________.
a. Linear c. Positive
b. Non-linear d. Negative
8. The relationship between the variables x and y is shown on the scatterplot at
right. The correlation between x and y would be best described as:
a. a weak positive association
b. a weak negative association
c. a strong positive association
d. a strong negative association
10. Which of the following statements best describes the graph below?
[Scatter plot: Tobacco on the x-axis (0 to 5) versus Alcohol on the y-axis (0 to 8)]
Lesson 1
Bivariate Data
When one measurement is made on each observation, univariate analysis is
applied. If more than one measurement is made on each observation, multivariate
analysis is applied. In this section, we focus on bivariate analysis, where exactly two
measurements are made on each observation. So let’s find out more about this.
What’s In
Getting To Know You!
Directions: Read each statement carefully. Choose your answer from the given
answer pool below. Write your answer on the space provided before each number.
Dependent variable
Independent variable
Categorical
Quantitative
Bivariate data
Two-way frequency table
1. This can be used to show the relationship between two categorical variables.
2. It measures an outcome of a study.
3. It may explain or influence changes in a response variable.
4. It shows the relationship between two quantitative variables measured on the
same individual.
5. It involves quantities which are measurable or countable.
Notes to the Teacher
The teacher must consider the prerequisite skills needed in
the development of this competency including the schema or
background knowledge which may reinforce learning. This module
will help the learners bridge the gap of learning to attain mastery
of the lesson in its spiral progression.
What is New
Vocabulary:
What is It
Bivariate Data
Bivariate data are ubiquitous in all fields of scientific research. They consist
of paired measurements on two quantitative variables, say X and Y. That is, each
observation i in a data set containing n observations has numerical values x_i
and y_i. Thus, bivariate data deal with two variables that can change and
are compared to find relationships. If one variable influences the other, the
bivariate data have an independent and a dependent variable, because one variable
depends on the other for change. An independent (explanatory) variable is a
condition or piece of data in an experiment that can be controlled or changed.
A dependent (outcome) variable is a condition or piece of data in an experiment
that is controlled or influenced by an outside factor, most often the
independent variable.
This is very different from univariate data, in which a single variable in a data
set is analyzed to describe a scenario or experiment.
For example, if Mindy is studying for her quarterly tests and tracks her study
time and her test scores, she might see that the more time she spends studying, the
better her test scores become. In this scenario, Mindy's test scores are the
dependent variable because they depend on the number of hours she studies.
The number of study hours is the independent variable. In this way, we can see
the relationship in this bivariate data set.
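As an illustration of how such paired observations might be recorded, the short Python sketch below stores each observation as an (x, y) pair. The numbers are hypothetical, since the module does not give Mindy's actual study hours or scores, and the names `Observation`, `study_hours`, and `test_score` are assumptions made only for this example.

```python
from typing import NamedTuple


class Observation(NamedTuple):
    """One bivariate observation: two measurements made for the same test."""
    study_hours: float  # independent (explanatory) variable, x
    test_score: float   # dependent (outcome) variable, y


# Hypothetical values for illustration only; they are not taken from the module.
# The records are listed in order of increasing study time.
records = [
    Observation(1.0, 72),
    Observation(2.0, 78),
    Observation(3.5, 85),
    Observation(5.0, 91),
]

# Separate the pairs into the two variables while keeping their order aligned.
xs = [r.study_hours for r in records]
ys = [r.test_score for r in records]

# In this made-up data set, every increase in study time comes with a higher score.
scores_rise_with_hours = all(
    later.test_score > earlier.test_score
    for earlier, later in zip(records, records[1:])
)

print(list(zip(xs, ys)))
print("Scores rise with study hours:", scores_rise_with_hours)
```

Keeping the two measurements together in one record makes it hard to mix up which score belongs to which study time, which is the essential feature of bivariate data.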
There are ways of displaying the data and of measuring relationships between
the two variables. The methods we employ to do this depend on the type of variables
we are dealing with; that is, they depend on whether the data are numerical or
categorical.
Parallel boxplots - display the relationship between a numerical variable
and a categorical variable with two or more categories.
Segmented bar charts - a powerful visual aid for comparing and examining the
relationship between two categorical variables.
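The choice of display described above can be summarized as a small lookup. The Python sketch below is only an illustration: the function name `suggest_display` and the labels "numerical" and "categorical" are assumptions for this example, and the scatter plot case for two numerical variables is taken from the scatter plot lesson later in this module.

```python
def suggest_display(first_type: str, second_type: str) -> str:
    """Suggest a display for two variables, each 'numerical' or 'categorical'."""
    kinds = sorted([first_type, second_type])
    if kinds == ["numerical", "numerical"]:
        return "scatter plot"
    if kinds == ["categorical", "categorical"]:
        return "segmented bar chart"
    if kinds == ["categorical", "numerical"]:
        return "parallel boxplot"
    raise ValueError("each type must be 'numerical' or 'categorical'")


print(suggest_display("numerical", "categorical"))    # parallel boxplot
print(suggest_display("categorical", "categorical"))  # segmented bar chart
print(suggest_display("numerical", "numerical"))      # scatter plot
```

The order of the two arguments does not matter, because the types are sorted before they are compared.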
What’s More
To Be With You More!
Directions: Determine what type and how many variables were used in each graph.
Write your answer in the right column.
Graph Variables
1.
2.
3.
4.
5.
1. The teacher records students' grades on a test and the number of days until
their next birthday and wants to know if there is a relationship.
Student    Test Grade    Days Until Birthday
Angelo     100           10
Bryan      82            300
Cecille    97            254
Daniel     77            28
Elena      84            211
a. What type of data is this?
b. What would you use to display it?
a. Which data sets are categorical?
b. Which data sets are numerical?
c. Choose one set of data to display in a box plot.
d. Choose one set of data to display in a bar graph.
e. Choose one set of data to compare in a scatter plot.
What’s In
Data Match Relation Activity
Directions: Write the letter of the correct answer in each item.
6. What do scatterplots display?    f. Suppose we have data on variables x and y for n
individuals. The values for the first
individual are (x1, y1), the values for the
second individual are (x2, y2), and so on.
What is New
KNOW THESE!
The graphical devices most widely used today for displaying quantitative data
in popular presentation graphics (pie charts, line graphs, and bar charts) are,
in their modern form, generally attributed to William Playfair (1759–1823).
All of these were essentially one-dimensional.
The next major invention, and the first true two-dimensional one, is the
scatterplot. Indeed, among all the forms of statistical graphics, the humble
scatterplot may be considered the most versatile, polymorphic, and generally useful
invention in the entire history of statistical graphics. Tufte (1983) estimated that
between 70 and 80 percent of graphs used in scientific publications are scatterplots.
What is It
SCATTER PLOT
Scatter plots are diagrams that show the degree and pattern of the relationship
between two sets of data. For instance, a researcher wants to find out if there is
a relationship between height and weight. Here height is the independent variable
and weight is the dependent variable. If a person gets taller, his weight may
increase, but an increase in his weight will not make the person taller. This does
not mean that one variable causes the other; it simply means that there is an
association between the two. The data are plotted on the xy coordinate plane. Each
data point on a scatter plot represents two values (x, y). The abscissa of the point
is a value of the independent variable (x) and the ordinate is a value of the
dependent variable (y). Using scatter plots, we can identify the relationship
between two attributes, clusters of points, and outliers.
Outliers: Do there appear to be any data points that are unusually far away from
Example:
The table below shows the time in hours (x) spent by six grade 11 students in
studying their modules and their scores (y) on a test. Construct a scatter plot.
[Scatter plot: time in hours on the x-axis (0 to 6) versus score on the y-axis (0 to 40)]
The points plotted on the x-y coordinate plane seem to follow a straight line that
points upward to the right. This indicates that the two variables are to some extent
linearly related and that the relationship between them is positive. The scatterplot
represents a positive correlation: as the time spent studying increases, the
scores also increase.
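To make the construction concrete, the Python sketch below draws a rough text-only scatter plot by placing one `*` per (x, y) pair on a character grid. The six data pairs are hypothetical stand-ins, not the values from the module's table, and the function name `text_scatter` and the grid size are assumptions chosen for this example. A plotting library or graphing paper would normally be used instead.

```python
def text_scatter(xs, ys, cols=25, rows=9):
    """Return a rough text scatter plot of the points (xs[i], ys[i]).

    Assumes at least two distinct x-values and two distinct y-values.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * cols for _ in range(rows)]
    for x, y in zip(xs, ys):
        # Scale each value to a column (x) or row (y) index on the grid.
        col = round((x - x_min) / (x_max - x_min) * (cols - 1))
        row = round((y - y_min) / (y_max - y_min) * (rows - 1))
        grid[rows - 1 - row][col] = "*"  # the top line holds the largest y
    lines = ["|" + "".join(line) for line in grid]
    lines.append("+" + "-" * cols)  # x-axis
    return "\n".join(lines)


# Hypothetical data: hours of study (x) and test scores (y) for six students.
hours = [1, 2, 3, 4, 5, 6]
scores = [12, 18, 20, 27, 31, 38]

plot = text_scatter(hours, scores)
print(plot)
```

For this made-up data the stars climb from the lower left to the upper right, which is the upward trend described above.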
DRAWING CONCLUSIONS/CAUSATION
When data are graphed, we can often estimate by eye (rather than measure) the type
of correlation involved. Our ability to make these qualitative judgments can be seen
from the following examples, which summarize the different types of correlation that
might appear in a scatterplot.
Weak positive linear relationship - the dots are widely spread.
Weak negative linear relationship - the dots are widely spread.
No relationship - there is no form, no linear trend, just a
random placement of points.
Note: In linear relationships, the trend in the data is best described by a straight line.
That is, we could fit a straight line in the center of the scatterplot to indicate the trend
in the data.
Line of best fit
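One common way to compute such a line, instead of drawing it by eye, is the method of least squares. The Python sketch below is a minimal illustration of that method and is not a procedure given in this module. The data are hypothetical, and the function name `least_squares_line` is an assumption for this example.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations in x.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical data: hours of study (x) and test scores (y) for six students.
hours = [1, 2, 3, 4, 5, 6]
scores = [12, 18, 20, 27, 31, 38]

slope, intercept = least_squares_line(hours, scores)
print(f"score = {slope:.2f} * hours + {intercept:.2f}")
```

A positive slope corresponds to a positive linear trend and a negative slope to a negative one. For this made-up data the slope is about 5, meaning roughly 5 more points per extra hour of study.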
What’s More
A scatter plot diagram is a graph that shows the (1) _____________ between two
(2) ______________ variables. It uses (3) _____________ to represent values for two
numeric variables. Scatter plots not only report the values of individual data
points, but also (4) ___________ when the data are taken as a whole. They further
help in applying the most suitable (5) ___________ analysis technique.
Scatter Plot    Description
1. [Scatter plot: time in hours (x-axis, 0 to 8) versus distance travelled in km (y-axis, 0 to 60)]
3. [Scatter plot: Temperature (x-axis, 0 to 60) versus Altitude (y-axis, 0 to 300)]
[Scatter plot: axis labels not recoverable; x-axis 0 to 20, y-axis 0 to 2000]
5. [Scatter plot: No. of hours of exercise in a week (x-axis, 0 to 8) versus weight in lbs (y-axis, 120 to 160)]
What I Can Do
Directions: Read each problem carefully and answer the question/s that follow/s.
1. Jayson plays basketball for his high school at Bangui. He wants to improve so that
he can play at the college level. He notices that the number of points he scores in a
game goes up in response to the number of hours he practices his jump shot each week.
He records the following data.
No. of hours practicing jump shot    4    5    7    10   11   12
Points scored in a game              12   15   21   28   33   37
A. Construct a scatter plot and state whether what Jayson thinks appears to be true.
2. The population of some municipalities in Ilocos Norte (to the nearest thousand)
together with the number of primary schools in that particular municipality is given
below for 10 municipalities.
Population                35   29   15   24   3    33   32   25   38
No. of primary schools    16   13   7    12   2    15   15   12   18
Assessment
Multiple Choice. Choose the letter of the best answer. Write the chosen letter on a
separate sheet of paper.
1. Which one of the following variables is not categorical?
a. Age of a person.
b. Gender of a person: male or female.
c. Choice on a test item: true or false.
d. Marital status of a person (single, married, divorced, other)
5. When the values of two variables move in the same direction, correlation is
said to be _____________.
a. Linear c. Positive
b. Non-linear d. Negative
6. If all the points of a scatter diagram lie on a straight line falling from the
upper left corner to the lower right corner, the correlation is called ___________.
a. Zero correlation
b. Perfect negative correlation
c. High degree of positive correlation
d. Perfect positive correlation
a. Show the number of days the symptoms persisted on the x-axis, as this
is the independent variable and the daily dosage of vitamin C on the y-
axis, as this is the dependent variable.
b. Show the daily dosage of vitamin C on the x-axis, as this is the dependent
variable and the number of days the symptoms persisted on the y-axis,
as this is the independent variable.
c. Show the number of days the symptoms persisted on the x-axis, as this
is the dependent variable and the daily dosage of vitamin C on the y-
axis, as this is the independent variable.
d. Show the daily dosage of vitamin C on the x-axis, as this is the
independent variable and the number of days the symptoms persisted on
the y-axis, as this is the dependent variable.
8. Choose the scatterplot that best fits this description: “There is a moderately
strong negative linear association between the two variables with a few potential
outliers”
a. [Scatter plot]    c. [Scatter plot]
b. [Scatter plot]    d. [Scatter plot]
9. Mrs. Roque made a scatter plot to compare the number of questions each
student missed on the pre-test and the post-test in Mathematics, as shown in the
graph.
[Scatter plot titled "# of Questions Missed", with an axis labelled "Post test"]
How many of Mrs. Roque’s 10 students
10. Which statement is supported by the data shown below?
Hours of Sleep and Grade Point Average (G.P.A.)
Student    Hours of sleep per night    G.P.A.
Riza       7.2                         3.3
Star       4.5                         2.1
John       5.9                         2.7
James      5.1                         2.5
Merry      6                           3.0
Yam        7.5                         3.6
Dem        7.0                         3.5
Additional Activities
Data Construction Analysis Activity
a. Using a meter stick or ruler, measure the arm span and height of
7 of your neighbours who are 10 years old.
b. Construct 10 squares. Measure the side of each square and find its perimeter.
QUESTIONS:
E. For each of the pairs of variables, indicate whether the second variable
would increase or decrease in response to an increase in the value of the
first variable.
Rubric
Point Descriptor
4 The variables are correctly identified and accurately labelled in the scatterplot
diagram, and each individual point is properly illustrated as shown in the table of
values. All questions are answered completely.
3 The variables are correctly identified but are not shown in the scatterplot diagram,
and the individual points are properly illustrated yet incomplete as shown in the
table of values. All questions are answered completely.
2 The variables are correctly identified but are not shown in the scatterplot diagram,
and the individual points are properly illustrated yet incomplete as shown in the
table of values. All questions are answered completely, but some important details
are not indicated.
1 The variables are not correctly identified and are not shown in the scatterplot
diagram, and the individual points are properly illustrated yet incomplete as shown
in the table of values. All questions are answered completely, but some important
details are not indicated.
Answer Key
What’s In
What I Can Do
1. A. Yes, what Jayson thinks appears to be true: the more hours he spends practicing
his jump shot, the more points he scores in a game.
2. There is a strong positive linear relationship between the population and the number
of primary schools: as one variable increases, the other variable also increases.
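These two answers can also be checked numerically. The Python sketch below computes Pearson's correlation coefficient r for the two tables in What I Can Do. The coefficient is a standard measure of linear association that goes beyond the by-eye judgment this module asks for, and the function name `pearson_r` is an assumption for this example. Only the nine population and school pairs that appear in the table are used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired values; it lies between -1 and 1."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_yy = sum((y - y_mean) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Item 1: hours practicing the jump shot (x) and points scored in a game (y).
practice_hours = [4, 5, 7, 10, 11, 12]
points_scored = [12, 15, 21, 28, 33, 37]

# Item 2: population in thousands (x) and number of primary schools (y).
population = [35, 29, 15, 24, 3, 33, 32, 25, 38]
primary_schools = [16, 13, 7, 12, 2, 15, 15, 12, 18]

r_jayson = pearson_r(practice_hours, points_scored)
r_schools = pearson_r(population, primary_schools)

print(f"Jayson: r = {r_jayson:.3f}")
print(f"Schools: r = {r_schools:.3f}")
```

Values of r close to +1 indicate a strong positive linear association, which is consistent with both answers above.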
What I Have Learned
1. Strong positive linear association between the two variables. As the amount of time
increases the distance travelled also increases.
2. There is no pattern, thus there is no association between the two variables.
3. Strong negative linear association between the two variables. As the temperature
increases, altitude decreases.
4. Weak positive linear association between the two variables. As the amount of time
spent working increases, the amount of money also increases.
5. Perfect negative linear association between the two variables. As the amount of
time spent exercising increases, the weight decreases.
References
Baccay, Elisa S., Rene R. Belecina, and Efren B. Mateo. (2016). Statistics and
Probability, First Edition. Quezon City: Rex Printing Company, Inc.
https://quizlet.com/46152467/match
https://quizlet.com/203364351/match
https://www.ics.uci.edu/~jutts/8/SampleMT1MCKey.pdf
http://rubinmath.weebly.com/uploads/3/8/9/1/38911697/scatter_plot_practice.pdf
https://betterlesson.com/lesson/629922/representing-bivariate-data-sets
http://www.shodor.org/interactivate/lessons/UnivariateBivariateData/
http://www.math.yorku.ca/people/georges/Files/NATS1500/Tests/Sample_Final_with_Solutions.pdf
https://study.com/academy/practice/quiz-worksheet-bivariate-data.html
http://dept.stat.lsa.umich.edu/~kshedden/Courses/Stat401/Notes/401-bivariate-slides.pdf
https://www.researchgate.net/publication/323196619_How_to_describe_bivariate_data
https://study.com/academy/lesson/what-is-bivariate-data-definition-examples.html
https://www.hawkermaths.com/uploads/7/7/3/8/77386549/bivariate_data_chap_2.pdf
https://www.researchgate.net/publication/7923211_The_Early_Origins_and_Development_of_the_Scatterplot
http://sdeuoc.ac.in/sites/default/files/sde_videos/Quantitative%20Techniques%20for%20Business%20Decision.pdf
Office Address : Brgy. 7B, Giron Street, Laoag City, Ilocos Norte
Telefax : (077) 771-0960
Telephone No. : (077) 770-5963, (077) 600-2605
E-mail Address : [email protected]
Feedback link : https://bit.ly/sdoin-clm-feedbacksystem