
ANOVA, Correlation and Regression
Dr. Faris Al Lami
MB ChB, PhD, FFPH

ANALYSIS OF VARIANCE
(ANOVA, F-test)
Learning Objectives

1. Use analysis of variance (ANOVA) to test differences between the means of more than two samples.
2. Obtain a measure of the linear relationship between two quantitative random variables (X and Y).
3. Interpret the scatter diagram.
4. Interpret the value of the linear correlation coefficient (r).
5. Identify the use of simple linear regression.
ANALYSIS OF VARIANCE (ANOVA)
• It is a technique in which the total variation present in a set of data is partitioned into several components, each attributable to a specific source of variation.
• ANOVA can ascertain the magnitude of the contribution of each of these sources to the total variation.
• ANOVA tests the hypothesis that three or more population means are equal.
H0: µ1 = µ2 = µ3 = . . . = µk
H1: At least one mean is different
ANOVA methods require the F-distribution
1. The F-distribution is not symmetric; it is skewed to the right.
2. Values of F can be zero or positive; they cannot be negative.
3. There is a different F-distribution for each pair of degrees of freedom (numerator and denominator). Critical values of F are given in the F table.
One-Way ANOVA
Assumptions
1. The populations have approximately normal distributions.
2. The populations have the same variance σ².
3. The samples are simple random samples.
4. The samples are independent of each other.
5. The different samples come from populations that are categorized in only one way.
The total sum of squares (SST)
• The sum of the squared deviations of each observation from the mean of all the observations taken together (the grand mean).

Within-Groups Sum of Squares (SSW)
• Computed within each group as the sum of the squared deviations of the individual observations from their group mean. Also called the error sum of squares.

Among-Groups Sum of Squares (SSA)
• Computed for each group as the squared deviation of the group mean from the grand mean, multiplied by the size of the group, then summed over the groups.

• SST = SSA + SSW
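The decomposition SST = SSA + SSW can be checked numerically. The sketch below uses made-up data for three hypothetical groups (the slides themselves use SPSS; this is only an illustration of the formulas):

```python
# Sketch: verifying SST = SSA + SSW on three hypothetical groups
# (the group values below are made-up illustration data).
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]]
all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)

# Total sum of squares: deviations of every observation from the grand mean
sst = sum((x - grand_mean) ** 2 for x in all_obs)

# Within-groups (error) sum of squares: deviations from each group's own mean
ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

# Among-groups sum of squares: squared deviation of each group mean from
# the grand mean, weighted by the group size
ssa = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

print(sst, ssa + ssw)  # the two totals should match
```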


Mean Squares (MS)
The sums of squares, SS(among) and SS(within), divided by their corresponding degrees of freedom:

MST = SS(total) / (N − 1)
MSW = SS(within) / (N − k)
MSA = SS(among) / (k − 1)
Test Statistic for ANOVA

F = MS(among) / MS(within)

• Numerator df = k − 1
• Denominator df = N − k
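Putting the pieces together, a minimal sketch of the F statistic on made-up data, checked against SciPy's one-way ANOVA (an illustration only, not the SPSS workflow the slides use):

```python
# Sketch: F = MSA / MSW computed from the mean squares, compared with
# SciPy's one-way ANOVA on the same (made-up) three groups.
from scipy import stats

groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]]
k = len(groups)                         # number of groups
n_total = sum(len(g) for g in groups)   # N
all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / n_total

ssa = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

msa = ssa / (k - 1)         # among-groups mean square, df = k - 1
msw = ssw / (n_total - k)   # within-groups mean square, df = N - k
f_by_hand = msa / msw

f_scipy, p_value = stats.f_oneway(*groups)
print(f_by_hand, f_scipy, p_value)  # the two F values should agree
```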
SPSS output
Go to Options and select Descriptives.
ANOVA model: Example

• As shown in the descriptive output, the mean score was highest for the front row and lowest for the back row.
• Null hypothesis: no difference between the three population mean scores, i.e., there is nothing special about the sitting position affecting students' scores.
F-test

• The F ratio is 15.7.
• The P value of the ANOVA is <0.001.
• We reject the null hypothesis.
• There is a statistically significant difference in mean score for at least one group compared with the others.
ANOVA model: Where is the difference?

• ANOVA itself cannot identify exactly where the difference in means lies (front vs middle, front vs back, middle vs back).
• Repeated independent-samples t-tests cannot be used, since this would inflate the type I error (alpha) above 0.05.
Testing for Significant Differences Between Individual Pairs of Means
• Bonferroni t-test
• Tukey's HSD test
• Both are multiple-comparison procedures for the null hypothesis that all possible pairs of treatment means are equal.
• HSD uses a single critical value against which all pairwise differences are compared.
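The Bonferroni idea can be sketched as follows: run every pairwise t-test, but divide alpha by the number of comparisons. The group names and scores below are made up for illustration (the slides use SPSS for this step):

```python
# Sketch of the Bonferroni approach: all pairwise independent-samples
# t-tests, each judged against alpha / (number of comparisons), so the
# overall type I error stays near 0.05. Data are invented.
from itertools import combinations
from scipy import stats

groups = {"front":  [85.0, 88, 90, 87],
          "middle": [80.0, 79, 83, 81],
          "back":   [70.0, 72, 68, 71]}
alpha = 0.05
pairs = list(combinations(groups, 2))
adjusted_alpha = alpha / len(pairs)   # Bonferroni correction

for name_a, name_b in pairs:
    t, p = stats.ttest_ind(groups[name_a], groups[name_b])
    print(f"{name_a} vs {name_b}: t={t:.2f}, p={p:.4f}, "
          f"significant={p < adjusted_alpha}")
```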
THE CORRELATION MODEL
OBJECTIVE
• Obtain a measure of the relationship between two random variables (X and Y).

Pearson's Correlation Coefficient (r)
• A measure of the linear (straight-line) relationship between two interval-level variables.
• Its value lies between −1 and +1:
−1: perfect inverse linear correlation
+1: perfect positive linear correlation
0: no linear correlation
Pearson's Correlation Coefficient (r)
• The absolute value of r indicates the strength of the relationship:
• <0.2 : very weak
• 0.2 to <0.4 : weak
• 0.4 to <0.7 : moderate
• 0.7 to <0.9 : strong
• ≥0.9 : very strong
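These cut-offs translate directly into a small helper function (the function name is my own, not from the slides):

```python
# A small helper reflecting the cut-offs above: classify the strength of a
# correlation from the absolute value of r. The sign (direction) is a
# separate question, handled on the next slide.
def correlation_strength(r: float) -> str:
    a = abs(r)
    if a < 0.2:
        return "very weak"
    if a < 0.4:
        return "weak"
    if a < 0.7:
        return "moderate"
    if a < 0.9:
        return "strong"
    return "very strong"

print(correlation_strength(0.955))   # "very strong"
print(correlation_strength(-0.55))   # "moderate" (the sign is ignored)
```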
Pearson's Correlation Coefficient (r)
• The sign of r indicates the direction of the relationship.
• A positive correlation indicates that high scores on one variable are associated with high scores on the second variable.
• A negative correlation indicates that high scores on one variable are associated with low scores on the second variable.
Testing the significance of (r)
• The r value is the sample value; ρ (rho) is the population value.
• H0: ρ = 0
• HA: ρ ≠ 0
• t = r √((n − 2) / (1 − r²)), with df = n − 2
• A larger r and a bigger sample size give a higher calculated t value, and thus a higher probability of statistical significance.
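The t formula above can be checked against the P value SciPy reports for the same data. The x and y values below are made up and roughly linear, purely to illustrate the calculation:

```python
# Sketch: turning a sample r into a t statistic with df = n - 2, and
# confirming the resulting two-sided P value against scipy.stats.pearsonr.
import math
from scipy import stats

x = [1.0, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.2, 8.8]   # made-up, roughly linear

r, p_scipy = stats.pearsonr(x, y)
n = len(x)
t = r * math.sqrt((n - 2) / (1 - r ** 2))
df = n - 2

# Two-sided P value from the t-distribution
p_by_hand = 2 * stats.t.sf(abs(t), df)
print(r, t, p_by_hand, p_scipy)  # the two P values should agree
```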
Scatter Diagram
• A scatter diagram is a graphic device used to visually summarize the relationship between two variables.
• The X-axis represents the independent variable.
• The Y-axis represents the dependent variable.
• In the correlation model it is not important to know which variable is dependent and which is independent; in the regression model this distinction is crucial.
• The closer the dots (representing pairs of observations for the study subjects) lie to the regression line, the stronger the linear correlation.
Systolic Blood Pressure Readings (mmHg) by Two Methods in 25 Patients with Essential Hypertension

Patient No.   Method I   Method II
1             132        130
2             138        134
3             144        132
4             146        140
5             148        150
6             152        144
7             158        150
8             130        122
9             162        160
10            168        150
11            172        160
12            174        178
13            180        168
14            180        174
15            188        186
16            194        172
17            194        182
18            200        178
19            200        196
20            204        188
21            210        180
22            210        196
23            216        210
24            220        190
25            220        202
[Scatter plot: Method II (y-axis, 100 to 220 mmHg) against Method I (x-axis, 100 to 220 mmHg). Systolic blood pressure readings (mmHg), 25 patients with essential hypertension.]
Example: SPSS output

• The linear correlation coefficient is 0.955.
• Its P value is <0.001.
• There is a statistically significant, very strong, direct linear correlation between method 1 and method 2 for measuring systolic blood pressure.
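As an illustration outside SPSS, the same coefficient can be computed from the tabulated data with SciPy (the slide reports r = 0.955):

```python
# Sketch: Pearson's r for the 25 paired systolic readings tabulated above
# (Method I vs Method II).
from scipy import stats

method1 = [132, 138, 144, 146, 148, 152, 158, 130, 162, 168, 172, 174, 180,
           180, 188, 194, 194, 200, 200, 204, 210, 210, 216, 220, 220]
method2 = [130, 134, 132, 140, 150, 144, 150, 122, 160, 150, 160, 178, 168,
           174, 186, 172, 182, 178, 196, 188, 180, 196, 210, 190, 202]

r, p = stats.pearsonr(method1, method2)
print(f"r = {r:.3f}, P = {p:.2e}")
```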
Simple Linear Regression

Y = a + bX
Simple Linear Regression

• It is another way to quantify the strength of the association between two quantitative variables, under the assumption of a normal distribution (a dose-response relationship).
• The independent variable (X) is pre-selected and called a non-random or mathematical variable. For each value of X there is a set of normally distributed values of Y.
Simple Linear Regression
The least-squares line summarizes the relationship between X and Y:
Y = a + bX
a = intercept: the point where the line crosses the vertical axis (i.e., the value of Y when X = 0)
b = slope: the amount by which Y changes for each unit change in X
X = independent variable
Y = dependent variable
Simple Linear Regression
It is helpful in:
• Ascertaining the probable form of the relationship between variables.
• Predicting or estimating the value of one variable corresponding to a given value of another: entering a specific value of X into the regression equation predicts the value of Y.
• Assessing the strength of the association between two quantitative variables measured on an interval/ratio scale. The higher the value of b (the regression coefficient), the stronger the effect of X on the value of Y (a stronger dose-response linear relation).
• Power of prediction of the model: measured by R² (the determination coefficient), which equals the square of r (the linear correlation coefficient). It measures the proportion of the observed variation in the response variable explained by the regression model.
• The least-squares method is used to estimate the two points needed to draw the regression line.
• The calculated regression coefficient (beta, or slope) is tested for statistical significance by a t-test against the null hypothesis that beta = 0 at the population level.
• The overall regression equation is tested for statistical significance by ANOVA. The model should be statistically significant before the results can be generalized to the reference population.
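The least-squares slope, intercept and R² can be sketched directly from their formulas; here again the two-methods blood pressure data serve as an illustration (the slides obtain these values from SPSS):

```python
# Sketch: least-squares slope and intercept, plus R², computed from the
# defining formulas on the two-methods blood pressure data.
import numpy as np

x = np.array([132, 138, 144, 146, 148, 152, 158, 130, 162, 168, 172, 174,
              180, 180, 188, 194, 194, 200, 200, 204, 210, 210, 216, 220,
              220], dtype=float)  # Method I (independent variable)
y = np.array([130, 134, 132, 140, 150, 144, 150, 122, 160, 150, 160, 178,
              168, 174, 186, 172, 182, 178, 196, 188, 180, 196, 210, 190,
              202], dtype=float)  # Method II (dependent variable)

# Slope: covariance of x and y over the variance of x
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()        # intercept: line passes through the means

r = np.corrcoef(x, y)[0, 1]
r_squared = r ** 2   # determination coefficient: share of variation explained

print(f"Y = {a:.1f} + {b:.3f} X,  R^2 = {r_squared:.3f}")
```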
(The same systolic blood pressure data tabulated earlier in the correlation example are used here.)
SPSS output

• The first table in the SPSS output is descriptive, specifying the independent and dependent variables.
SPSS output

• Unstandardized B (beta) is the regression coefficient. For each 1-score increase in the new test, the standardized test score is expected to increase by an average of 1.1.
• The calculated regression coefficient is statistically significant (P < 0.001).
• Regression equation: Y = −1 + 1.124 X
• Y is the response (dependent) variable and X is the explanatory (independent) variable.
SPSS output

• R is the simple linear correlation coefficient.
• R square is the determination coefficient.
• The new test explains 91.4% of the variation in the dependent variable.
SPSS output

• The ANOVA model tests the statistical significance of the regression model.
• The P value is <0.001, which indicates that the regression model is statistically significant.
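For simple linear regression the model ANOVA is tightly linked to the slope test: F equals the square of the slope's t statistic, and can be recovered from R². A sketch on made-up data (SciPy's linregress stands in for the SPSS output):

```python
# Sketch: the overall model F for simple linear regression recovered from
# R² as F = R² (n - 2) / (1 - R²); its P value from the F-distribution
# with (1, n - 2) df matches the slope's t-test P value. Data are invented.
from scipy import stats

x = [1.0, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.2, 8.8]   # made-up, roughly linear

res = stats.linregress(x, y)
n = len(x)
r_squared = res.rvalue ** 2
f_model = r_squared * (n - 2) / (1 - r_squared)

# P value of the whole model from the F-distribution with (1, n - 2) df
p_model = stats.f.sf(f_model, 1, n - 2)
print(f_model, p_model, res.pvalue)  # p_model matches the slope's P value
```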
