Comparing Two Linear Models with anova() in R

Last Updated : 13 Sep, 2024

Comparing two linear models is a fundamental task in statistical analysis, especially when determining whether a more complex model fits the data significantly better than a simpler one. In R, the anova() function performs an Analysis of Variance (ANOVA) to compare nested models.

What is a Linear Model?

A linear model describes the relationship between a response variable (dependent variable) and one or more explanatory variables (independent variables) using a linear equation.
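As a minimal sketch of this idea, the snippet below simulates data from a straight-line relationship and recovers the coefficients with lm(). The intercept (2), slope (3), noise level, and sample size are arbitrary values chosen for illustration; they do not come from any dataset used in this article.

```r
# Simulated data: the true intercept (2) and slope (3) are illustrative assumptions
set.seed(42)
x <- runif(100, min = 0, max = 10)
y <- 2 + 3 * x + rnorm(100, sd = 1)

# Fit y = b0 + b1 * x by ordinary least squares
fit <- lm(y ~ x)

# Estimated intercept and slope; they should land close to 2 and 3
print(coef(fit))
```

The estimates will not match the true values exactly because of the random noise added to y.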

ANOVA for Comparing Models

The Analysis of Variance (ANOVA) technique compares two nested models to determine whether the more complex model fits the data significantly better. Two models are nested when the simpler model is a special case of the more complex one, that is, the complex model contains every term of the simpler model plus one or more additional terms. The anova() function in R performs this comparison by calculating an F-statistic and a p-value. The null hypothesis is that the simpler model is adequate (the coefficients of the additional terms are zero), and the alternative hypothesis is that the more complex model is better. If the p-value is small (typically less than 0.05), we reject the null hypothesis and conclude that the complex model provides a significantly better fit.
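For reference, the F-statistic for two nested linear models can be written as below. The symbols are notation introduced here for illustration: RSS_1 and df_1 are the residual sum of squares and residual degrees of freedom of the simpler model, and RSS_2 and df_2 are those of the more complex model.

```latex
F = \frac{(\mathrm{RSS}_1 - \mathrm{RSS}_2) / (df_1 - df_2)}{\mathrm{RSS}_2 / df_2}
```

Under the null hypothesis, and given the usual linear model assumptions, this statistic follows an F distribution with (df_1 - df_2, df_2) degrees of freedom.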

Syntax:

anova(model1, model2)

model1: The simpler model.
model2: The more complex model.

The following steps compare two linear models using the mtcars dataset in R. The mtcars dataset contains data on fuel consumption and ten aspects of automobile design and performance for 32 cars.

Step 1: Loading the Data

We will use the mtcars dataset, which is preloaded in R. First, inspect the data:

# Load the dataset
data(mtcars)

# Display the first few rows of the dataset
head(mtcars)

Output:

                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
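Before fitting anything, it can be worth confirming that the columns used for modelling contain no missing values. lm() drops incomplete rows by default, and anova() refuses to compare models fitted to different numbers of observations. The sketch below checks the three columns used in the next steps; the variable names vars, na_counts, and n_complete are illustrative.

```r
# Columns used by the models in the following steps
vars <- c("mpg", "wt", "hp")

# Count missing values per column; all zeros means both models
# will be fitted to the same rows
na_counts <- colSums(is.na(mtcars[, vars]))
print(na_counts)

# Number of rows that are complete across the three columns
n_complete <- sum(complete.cases(mtcars[, vars]))
print(n_complete)
```

For mtcars every count is zero and all 32 rows are complete, so no rows are dropped.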

Step 2: Building the Models

We will build two models:

Model 1 (Simpler): This model predicts the miles per gallon (mpg) using only the weight (wt) of the car.
Model 2 (Complex): This model predicts mpg using both weight (wt) and horsepower (hp).

# Build Model 1: mpg as a function of weight (wt)
model1 <- lm(mpg ~ wt, data = mtcars)

# Build Model 2: mpg as a function of weight (wt) and horsepower (hp)
model2 <- lm(mpg ~ wt + hp, data = mtcars)
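Since anova() is only meaningful here for nested models fitted to the same data, a quick sanity check can help. The sketch below is one simple way to do it, assuming both models are plain additive formulas: it compares term labels and observation counts. The helper names terms1, terms2, is_nested, and same_n are illustrative.

```r
# Refit the two models so this snippet runs on its own
model1 <- lm(mpg ~ wt, data = mtcars)
model2 <- lm(mpg ~ wt + hp, data = mtcars)

# Term labels of each model's formula
terms1 <- attr(terms(model1), "term.labels")
terms2 <- attr(terms(model2), "term.labels")

# Nested (in this simple sense) if every term of model1 also appears in model2
is_nested <- all(terms1 %in% terms2)

# Both models must be fitted to the same number of observations
same_n <- nobs(model1) == nobs(model2)

print(c(nested = is_nested, same_n = same_n))
```

This term-label comparison is a rough check only; it does not cover cases such as transformed predictors or differently coded factors.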

Step 3: Comparing the Models with `anova()`

Now, we use the anova() function to compare the two models.

# Compare the two models using ANOVA
anova_result <- anova(model1, model2)
print(anova_result)

Output:

Analysis of Variance Table

Model 1: mpg ~ wt
Model 2: mpg ~ wt + hp
  Res.Df    RSS Df Sum of Sq      F   Pr(>F)   
1     30 278.32                                
2     29 195.05  1    83.274 12.381 0.001451 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Model 1 (Simpler) has 30 residual degrees of freedom (Res.Df) and a residual sum of squares (RSS) of 278.32.
Model 2 (Complex) has 29 residual degrees of freedom and an RSS of 195.05.
The difference in degrees of freedom (Df) between the two models is 1, and the reduction in the residual sum of squares (Sum of Sq) is 83.274.
The F-statistic is 12.381, and the p-value is 0.001451.
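The values in the table can be reproduced by hand from the two fitted models, which helps show where the F-statistic comes from. The sketch below divides the drop in RSS per added parameter by the residual variance of the complex model. The variable names rss1, rss2, df1, df2, f_stat, and p_value are illustrative.

```r
# Refit the two models so this snippet runs on its own
model1 <- lm(mpg ~ wt, data = mtcars)
model2 <- lm(mpg ~ wt + hp, data = mtcars)

# For lm objects, deviance() returns the residual sum of squares
rss1 <- deviance(model1)
rss2 <- deviance(model2)

# Residual degrees of freedom of each model
df1 <- df.residual(model1)
df2 <- df.residual(model2)

# F = (drop in RSS per extra parameter) / (residual variance of the complex model)
f_stat <- ((rss1 - rss2) / (df1 - df2)) / (rss2 / df2)

# Upper-tail probability of the F distribution with (df1 - df2, df2) degrees of freedom
p_value <- pf(f_stat, df1 - df2, df2, lower.tail = FALSE)

print(c(F = f_stat, p = p_value))
```

The printed values should agree with the F and Pr(>F) columns reported by anova(model1, model2).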

Since the p-value is much less than 0.05, we reject the null hypothesis and conclude that Model 2 (which includes both wt and hp) provides a significantly better fit than Model 1 (which only includes wt).
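When the comparison is part of a script, the decision can be made programmatically rather than by reading the printed table. The sketch below pulls the p-value out of the object returned by anova(), which behaves like a data frame, and compares it with a significance level. The 0.05 threshold and the name alpha are assumptions for illustration.

```r
# Refit and compare the two models so this snippet runs on its own
model1 <- lm(mpg ~ wt, data = mtcars)
model2 <- lm(mpg ~ wt + hp, data = mtcars)
anova_result <- anova(model1, model2)

# Row 2 of the table holds the comparison of model2 against model1
p_value <- anova_result[2, "Pr(>F)"]

# Significance level; 0.05 is a common convention, not a fixed rule
alpha <- 0.05

if (p_value < alpha) {
  message("Reject the null hypothesis: the complex model fits significantly better.")
} else {
  message("Fail to reject the null hypothesis: the simpler model is adequate.")
}
```

Row 1 of the table has no p-value (it is NA), because the first model only serves as the baseline for the comparison.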

Conclusion

In this article, we have explored how to compare two linear models using the anova() function in R. The ANOVA test provides a formal way to determine whether a more complex model provides a significantly better fit than a simpler model. This technique is particularly useful in model selection, stepwise regression, and hypothesis testing.

nyadavxenc
