
Linear Regression Analysis

Lecture 1
Variable
Qualitative and Quantitative variables
Association between Quantitative Variables
Correlation
Regression
Measure of association
• In statistics, any measure used to quantify a relationship between two
or more variables is a measure of association.
• Measures of association are used in various fields of research. For
example, in the areas of epidemiology and psychology, measures of
association are frequently used to quantify relationships between
exposures and diseases or behaviors.
• Data may be measured on an interval/ratio scale, an ordinal/rank
scale, or a nominal/categorical scale.
These three scales can be thought of as yielding continuous, integer-valued (ordered), and qualitative data, respectively.
• The method used to determine the strength of an association
depends on the characteristics of the data for each variable.
Pearson’s correlation coefficient

• Pearson’s correlation coefficient, r (ρ denotes the corresponding population parameter), measures the strength of the linear relationship between two variables measured on a continuous scale.
• A typical example of quantifying the association between two quantitative variables (measured on an interval/ratio scale) is the analysis of the relationship between a person’s height and weight.
• Each of these two characteristic variables is measured on a continuous
scale.
• The appropriate measure of association in this situation is Pearson’s correlation coefficient.
Estimate of correlation coefficient
• The correlation ρ is estimated by the sample correlation coefficient r, given in the expression below:
$$
r = \frac{\sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sqrt{\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]\left[\sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}\right]}}
$$

• Here the sample covariance between the two variables is divided by the square root of the product of the individual variances.
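As a quick illustration, here is a minimal Python sketch of this computational formula; the function name and the height/weight values are invented for the example, not taken from the lecture.

```python
import numpy as np

def pearson_r(x, y):
    """Sample correlation r via the computational formula above."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    sxy = np.sum(x * y) - np.sum(x) * np.sum(y) / n   # corrected sum of cross-products
    sxx = np.sum(x ** 2) - np.sum(x) ** 2 / n         # corrected sum of squares of x
    syy = np.sum(y ** 2) - np.sum(y) ** 2 / n         # corrected sum of squares of y
    return sxy / np.sqrt(sxx * syy)

# Made-up height (cm) and weight (kg) values for six people
height = [160, 165, 170, 175, 180, 185]
weight = [55, 60, 63, 70, 72, 80]
print(round(pearson_r(height, weight), 3))  # near +1: strong positive linear association
```

The same value is returned by np.corrcoef(height, weight)[0, 1], which uses the mathematically equivalent covariance form of the formula.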
Pearson’s product moment correlation
coefficient
• The coefficient r ranges from −1 to +1 inclusive.
• Values of −1 or +1 indicate a perfect linear relationship between the
two variables, whereas a value of 0 indicates no linear relationship.
(Negative values simply indicate the direction of the association,
whereby as one variable increases, the other decreases.)
• Correlation coefficients that differ from 0 but are not −1 or +1
indicate a linear relationship, although not a perfect linear
relationship.
• In practice, ρ (the population correlation coefficient) is estimated by r,
which is the correlation coefficient derived from sample data.
• Although Pearson’s correlation coefficient is a measure of the
strength of an association (specifically the linear relationship), it is not
a measure of the significance of the association.
• The significance of an association is assessed separately by testing the sample correlation coefficient, r, with a t-test that compares the observed r with the value expected under the null hypothesis.
Inferences for Correlation
• Let us consider testing the null hypothesis that there is zero
correlation between two variables Xj and Xk. Mathematically we write
this as shown below:
• H0: ρ = 0 against Ha: ρ ≠ 0
• To test the null hypothesis, we form the test statistic, t as below
$$
t = r\sqrt{\frac{n-2}{1-r^{2}}} \;\sim\; t_{n-2}
$$
• Under the null hypothesis, H0, this test statistic will be approximately
distributed as t with n - 2 degrees of freedom.
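A minimal sketch of this test in Python, using simulated data; the helper function is ours for illustration, not a standard API.

```python
import numpy as np
from scipy import stats

def correlation_t_test(x, y):
    """Test H0: rho = 0 against Ha: rho != 0 via t = r * sqrt((n-2)/(1-r^2))."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    r = np.corrcoef(x, y)[0, 1]
    t = r * np.sqrt((n - 2) / (1 - r ** 2))
    p = 2 * stats.t.sf(abs(t), df=n - 2)  # two-sided p-value from t with n-2 df
    return r, t, p

rng = np.random.default_rng(0)
x = rng.normal(size=30)
y = 0.5 * x + rng.normal(size=30)        # data simulated with a positive association
r, t, p = correlation_t_test(x, y)
print(f"r = {r:.3f}, t = {t:.3f}, p = {p:.4f}")
```

scipy.stats.pearsonr(x, y) carries out the same computation, returning r together with its two-sided p-value.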
Regression Analysis
Regression is a set of statistical processes for estimating the relationships between a dependent and one (or
more) independent variable(s).

• Dependent variable: also called the outcome or response.
• Independent variable: also called a predictor, covariate, or explanatory variable.
Linear Regression
The most common form of regression analysis is linear regression in
which one finds the line that most closely fits the data according to a
specific mathematical criterion.
Usage
Regression analysis is primarily used for two conceptually distinct purposes.
1. Prediction or forecasting
2. In some situations, regression analysis can be used to infer causal relationships between the independent
and dependent variables.
Importantly, regressions by themselves only reveal relationships between a dependent variable and a
collection of independent variables in a fixed dataset.
To use regressions for prediction or to infer causal relationships, respectively, a researcher must carefully justify
why existing relationships have predictive power for a new context or why a relationship between two
variables has a causal interpretation. The latter is especially important when researchers hope to estimate
causal relationships using observational data.
• The earliest form of regression was the method of least squares published by Legendre in 1805 and by Gauss
in 1809.
• Legendre and Gauss both applied the method to the problem of determining, from astronomical
observations, the orbits of bodies about the Sun (mostly comets, but also later the then newly discovered
minor planets).
• The term "regression" was coined by Francis Galton in the 19th century to describe a biological
phenomenon that the heights of descendants of tall ancestors tend to regress down towards a normal
average (a phenomenon also known as regression towards the mean.
• For Galton, regression had only a biological meaning, but later Yule and Karl Pearson extended his idea to a
more general statistical context.
Regression Model
In practice, researchers first select a model they would like to estimate and then use their chosen method (e.g.,
ordinary least squares) to estimate the parameters of that model.
Regression models involve the following components:

• The unknown parameters, often denoted as a scalar or vector β.
• The independent variables, which are observed in data and are often denoted as a scalar or vector
Xi (where i denotes a row of data).
• The dependent variable, which is observed in data and often denoted by Yi.
• The error terms, which are not directly observed in data and are often denoted by ei.
Most regression models propose that Yi is a function of Xi and β, with ei representing an additive error term that may stand in for un-modeled determinants of Yi or random statistical noise:

Yi = f(Xi, β) + ei .

The researchers' goal is to estimate the function f that most closely fits the data. To carry out regression
analysis, the form of the function f must be specified.
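To make the four components concrete, here is a small simulation sketch in which f is chosen to be linear; the parameter values and sample size are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50
beta = np.array([2.0, 0.5])          # unknown parameters (known here only because we simulate)
X = rng.uniform(0.0, 10.0, size=n)   # independent variable, observed in data
e = rng.normal(0.0, 1.0, size=n)     # error terms, not directly observed in practice
Y = beta[0] + beta[1] * X + e        # dependent variable: Y_i = f(X_i, beta) + e_i
```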
Linear Regression
In linear regression, the model specification is that the dependent variable, yi, is a
linear combination of the parameters (but need not be linear in the independent
variables).
For example, in simple linear regression for modelling n data points, there is one
independent variable, Xi, and two parameters, β0 and β1:
Straight line: yi = β0 + β1Xi + εi , i = 1, 2, …, n.
In multiple linear regression, there are several independent variables or functions
of independent variables. Adding a term in Xi² to the preceding regression gives:
Parabola: yi = β0 + β1Xi + β2Xi² + εi , i = 1, 2, …, n.
This is still a linear regression: although the expression on the right-hand side is
quadratic in the independent variable Xi, it is linear in the parameters β0, β1, and β2.
In both cases, εi is an error term and the subscript i indexes a particular
observation.
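A minimal sketch of fitting both models by least squares on simulated data; the parabola is handled by exactly the same linear machinery because the model is linear in β.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3.0, 3.0, 40)
y = 1.0 + 2.0 * x - 0.5 * x ** 2 + rng.normal(0.0, 0.5, size=40)

# Straight line uses the design matrix [1, x]; the parabola adds an x^2 column.
X_line = np.column_stack([np.ones_like(x), x])
X_para = np.column_stack([np.ones_like(x), x, x ** 2])

beta_line, *_ = np.linalg.lstsq(X_line, y, rcond=None)
beta_para, *_ = np.linalg.lstsq(X_para, y, rcond=None)
print("line:    ", beta_line)   # [b0, b1]
print("parabola:", beta_para)   # [b0, b1, b2], close to [1.0, 2.0, -0.5]
```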
Given a random sample from the population, we estimate the
population parameters and obtain the sample linear regression model

$$
\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i .
$$
The residual, ei = yi − ŷi, is the difference between the observed value of the
dependent variable and the value predicted by the fitted model above.
One method is to obtain parameter estimates that minimize the sum of
squared residuals, SSR.
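As a sketch of this idea, the standard closed-form solution for the simple linear case is implemented below (the function name is ours); the quantities it returns correspond to the review questions that follow.

```python
import numpy as np

def ols_simple(x, y):
    """Least-squares estimates minimizing the sum of squared residuals (SSR)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    resid = y - (b0 + b1 * x)   # e_i = y_i - yhat_i
    ssr = np.sum(resid ** 2)    # sum of squared residuals
    mse = ssr / (len(x) - 2)    # MSE: unbiased estimate of the error variance
    return b0, b1, ssr, mse
```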
• What is the formula for the least square estimates for simple linear
regression?
• What is the estimate of the variance?
• What is MSE?
• What are the assumptions of a simple linear regression model?
