
Simple Regression and Correlation MEE

This document is a report on regression and correlation submitted by five students to their professor Mian Abbas. It discusses simple linear regression, correlation, and the Pearson's r correlation coefficient. It provides formulas to calculate the slope and intercept of a regression line, and the Pearson's r value. An example using data on soda consumption and bathroom trips is worked through to demonstrate these statistical techniques.
Copyright
© Attribution Non-Commercial (BY-NC)

REPORT

ON

REGRESSION AND CORRELATION

SUBMITTED TO:

SIR MIAN ABBAS

SUBMITTED BY:

Doda Rasheed MB-SI-09-115

Shahbaz Hussain MB-SI-09-021

Muzamil Hussain Khan MB-SI-09-067

Irfan Sadiq MB-SI-09-023

M. Jahanzeb Abbasi MB-SI-09-049

Institute of Management Sciences


BAHAUDDIN ZAKARIYA UNIVERSITY, MULTAN.
Acknowledgement

We would like to express our sincere gratitude and heartfelt thanks to our guide, Mr. Mian Abbas, for the encouragement he gave us at every stage of this work. His kind nature and knowledgeable guidance gave us the spirit to complete our work more easily. We are immensely indebted to him for his affection and for the help he rendered in several ways. We are very grateful for his timely attention and criticism, and for being a constant source of inspiration. He constantly motivated us to step forward without being discouraged by setbacks and failure.
Simple Regression and Correlation
We are going to discuss a powerful statistical technique for examining whether or
not two variables are related. Specifically, we are going to talk about the ideas of
simple regression and correlation.
One reason why regression is powerful is that it lets us quantify how a dependent variable changes with an independent variable. Note, however, that regression by itself demonstrates association, not causality; showing that an independent variable actually causes a change in a dependent variable also requires a sound study design, such as a controlled experiment.
Scattergrams
The simplest thing we can do with two variables that we believe are related is to draw a scattergram. A scattergram is a simple graph that plots values of our dependent variable Y against our independent variable X. Normally we plot the dependent variable on the vertical axis and the independent variable on the horizontal axis.

For example, let’s make a scattergram of the following data set:

Individual   No. of Sodas Consumed   No. of Bathroom Trips
Rick                   1                       2
Janice                 2                       1
Paul                   3                       3
Susan                  3                       4
Cindy                  4                       6
John                   5                       5
Donald                 6                       5
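As a quick sketch of what this scattergram looks like, the data above can be rendered as a rough character grid in pure Python (no plotting library assumed; the function name is my own), with bathroom trips on the vertical axis and sodas on the horizontal axis:

```python
# Example data from the table: (sodas consumed, bathroom trips)
data = [(1, 2), (2, 1), (3, 3), (3, 4), (4, 6), (5, 5), (6, 5)]

def ascii_scattergram(points):
    """Render points on a character grid: Y on the vertical axis, X on the horizontal."""
    max_x = max(x for x, _ in points)
    max_y = max(y for _, y in points)
    rows = []
    for y in range(max_y, 0, -1):  # top row = largest Y value
        cells = ["*" if (x, y) in points else "." for x in range(1, max_x + 1)]
        rows.append(f"{y} " + " ".join(cells))
    rows.append("  " + " ".join(str(x) for x in range(1, max_x + 1)))  # X axis labels
    return "\n".join(rows)

print(ascii_scattergram(data))
```

Even in this crude picture, the stars drift up and to the right, which is the upward-sloping pattern discussed next.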

Eyeballing a regression line

Just from our scattergram, we can sometimes get a fairly good idea of the relationship between our variables. In our current scattergram, it looks like a line that slopes up to the right would “fit” the data pretty well.
Essentially, that’s all the math we have to do: figure out the best-fit line, i.e. the line that represents an “average” of our data points.

Note that sometimes our data won’t be linearly related at all; sometimes there may be a “curvilinear” or other nonlinear relationship. If it looks like the data are related but the regression doesn’t “fit” well, chances are this is the case.

Simple linear regression


While our scattergram gives us a fairly good idea of the relationship between the variables, and even some idea of how the regression line should look, we need to do the math to figure out exactly where it goes.
To figure it out, first we need an idea of the general equation for a line. From
algebra, any straight line can be described as:

Y = a + bX, where a is the intercept and b is the slope

Figuring out a and b

The problem of regression in a nutshell is to figure out what values of a and b to use. To do that, we use the following two formulas:

b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)²

a = ȳ − bx̄

Again, this looks ugly, but it’s all the same simple math we already know and love: just follow the order of operations (PEMDAS), and we will get the right answer.

Solving our example


So, let’s revisit our example data and figure out the slope and intercept for the
regression line.

Individual   No. of Sodas Consumed   No. of Bathroom Trips
Rick                   1                       2
Janice                 2                       1
Paul                   3                       3
Susan                  3                       4
Cindy                  4                       6
John                   5                       5
Donald                 6                       5

Solving our example (cont’d)

First, the slope. From the data, n = 7, x̄ = 24/7, and ȳ = 26/7, and the deviation sums work out to Σ(x − x̄)(y − ȳ) = 14.8571 and Σ(x − x̄)² = 17.7143, so

b = 14.8571 / 17.7143 = 0.8387.

Now that we have calculated b, calculating a is pretty simple; we just solve

a = (26/7) − 0.8387(24/7)
  = 3.7143 − (0.8387)(3.4286)
  = 3.7143 − 2.8756 = 0.8387.
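With a and b in hand, the fitted line Y = a + bX can be used for prediction. A quick sketch, using the rounded values from the worked example (the function name is my own):

```python
a, b = 0.8387, 0.8387  # intercept and slope from the worked example

def predict(x):
    """Predicted Y (bathroom trips) for a given X (sodas) on the fitted line."""
    return a + b * x

print(round(predict(4), 2))  # → 4.19
```

So the fitted line predicts roughly 4.19 bathroom trips for someone who drinks 4 sodas, which sits comfortably among the nearby data points.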

Pearson’s r

Now that we’ve found a and b, we know the intercept and slope of the regression
line, and it appears that X and Y are related in some way. But how strong is that
relationship?
That’s where Pearson’s r comes in. Pearson’s r is a measure of correlation; sometimes, we just call it the correlation coefficient. r tells us how strong the linear relationship between X and Y is.

Calculating Pearson’s r

The formula for Pearson’s r is somewhat similar to the formula for the slope (b). It is as follows:

r = Σ(x − x̄)(y − ȳ) / ( √Σ(x − x̄)² · √Σ(y − ȳ)² )

We already know the numerator from calculating the slope earlier, so the only hard part is the denominator, where we have to calculate each square root separately and then multiply them together.
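The same formula can be sketched in pure Python on the example data (the function name is my own, not from the report):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Each square root is computed separately, then multiplied together
    den = math.sqrt(sum((x - x_bar) ** 2 for x in xs)) * \
          math.sqrt(sum((y - y_bar) ** 2 for y in ys))
    return num / den

sodas = [1, 2, 3, 3, 4, 5, 6]
trips = [2, 1, 3, 4, 6, 5, 5]
r = pearson_r(sodas, trips)
print(round(r, 4), round(r ** 2, 4))  # → 0.8009 0.6414
```

Note that the numerator here is exactly the numerator used for the slope b, so the only new work is the denominator.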

Solving for our example

So, from our example, we already have the numerator 14.8571 and Σ(x − x̄)² = 17.7143 from the slope, and Σ(y − ȳ)² works out to 19.4286, so

r = 14.8571 / (√17.7143 × √19.4286)
  = 14.8571 / 18.5516 = 0.8009.

Correlations and determinations


A correlation coefficient of around .8 indicates that the two variables are fairly highly associated. If we square r, we get the coefficient of determination r², which tells us how much of the variance in Y is explained by X. In this case, r² = .6414, which means that we estimate that about 64% of the variance in Y is explained by X, while the remainder is due to error.
The only other thing we might want to do is find out whether the correlation is statistically significant. Or, to put it in terms of a null hypothesis, we want to test whether H0: ρ = 0 (the population correlation is zero) is true.

Significance testing for r

To test whether or not r is significantly different from zero, we use the t test for Pearson’s r:

t = r√(n − 2) / √(1 − r²)

Since this is like any other hypothesis test, we want to compare the obtained t with the critical t. For this test, we use our given alpha level (conventionally, .05 or .01) and df = n − 2. In this case, we subtract 2 from our sample size because we have two variables.

So, with α = .05, is the correlation significant?

Example significance test

Now, as in other significance tests, we find our critical value of t from the table (α = .05, df = 5: 2.571) and compare it to the obtained value. Since 2.99 > 2.571, we reject the null hypothesis and conclude that the correlation is statistically significant.
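The whole significance test can be sketched in pure Python (the critical value 2.571 is taken from the t table for α = .05 and df = 5, as in the text; the function name is my own):

```python
import math

def t_for_r(r, n):
    """t statistic for testing H0: rho = 0, with df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r, n = 0.8009, 7          # Pearson's r and sample size from the example
t_obt = t_for_r(r, n)
t_crit = 2.571            # from the t table: alpha = .05, df = n - 2 = 5
print(round(t_obt, 2), t_obt > t_crit)  # → 2.99 True
```

Since the obtained t exceeds the critical value, the code reaches the same conclusion as the hand calculation: reject the null hypothesis.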
