How to Extract the Residuals and Predicted Values from Linear Model in R?

Last Updated : 13 Sep, 2024

Extracting residuals and predicted (fitted) values from a linear model is essential for understanding how well the model fits the data. The lm() function fits linear models in R, and built-in functions such as fitted() and residuals() extract these values from the fitted model. This article covers the theory behind residuals and predicted values and the steps to extract them.

Linear Model

A linear model assumes that the relationship between the dependent variable Y and the independent variables X1, X2,…, Xn can be modeled as:

Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \epsilon

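  • β0 is the intercept.
  • β1,…,βn are the coefficients of the independent variables.
  • ϵ is the error term, the unobserved deviation of Y from the linear relationship. Residuals are its estimates, computed from the fitted model.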

Predicted Values

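Predicted (or fitted) values are the estimated values of the dependent variable Y based on the linear model. They represent the expected values of Y for given values of the independent variables, and are computed by plugging the estimated coefficients (written with a hat) into the model equation without the error term.

\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \dots + \hat{\beta}_n X_n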

Residuals

Residuals represent the difference between the observed values of Y and the predicted values Ŷ. They indicate how much the model's predictions deviate from the actual values.

\text{Residual} = Y - \hat{Y}
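The two formulas above can be checked by hand. The following is a minimal sketch, assuming the same mpg ~ wt model on mtcars that is used later in this article; the names fit, b, y_hat_manual and res_manual are illustrative only.

```r
# Fit a simple model on the built-in mtcars data
fit <- lm(mpg ~ wt, data = mtcars)

# Estimated coefficients: intercept and slope for wt
b <- coef(fit)

# Predicted values computed directly from the formula: Y-hat = b0 + b1 * wt
y_hat_manual <- b["(Intercept)"] + b["wt"] * mtcars$wt

# Residuals computed directly from the definition: Y - Y-hat
res_manual <- mtcars$mpg - y_hat_manual

# Both should agree with the built-in extractors (names are dropped before comparing)
all.equal(unname(y_hat_manual), unname(fitted(fit)))
all.equal(unname(res_manual), unname(residuals(fit)))
```

Both comparisons should return TRUE, which shows that fitted() and residuals() implement the definitions above rather than anything more elaborate.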

Extracting Residuals and Predicted Values

We will use the built-in mtcars dataset for this example, where we fit a linear model with mpg as the dependent variable and wt (weight) as the independent variable.

Step 1: Fit the Linear Model

First, fit the linear model with lm() and inspect it with summary().

R
# Fit a linear model
model <- lm(mpg ~ wt, data = mtcars)
summary(model)

Output:

Call:
lm(formula = mpg ~ wt, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.5432 -2.3647 -0.1252  1.4096  6.8727 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
wt           -5.3445     0.5591  -9.559 1.29e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.046 on 30 degrees of freedom
Multiple R-squared: 0.7528, Adjusted R-squared: 0.7446
F-statistic: 91.38 on 1 and 30 DF, p-value: 1.294e-10

Step 2: Extract the Predicted Values

To extract the predicted values, use the fitted() function.

R
# Extract predicted values
predicted_values <- fitted(model)
head(predicted_values)

Output:

        Mazda RX4     Mazda RX4 Wag        Datsun 710    Hornet 4 Drive Hornet Sportabout           Valiant 
         23.28261          21.91977          24.88595          20.10265          18.90014          18.79325 

The fitted() function returns the predicted values (or fitted values) for all observations based on the linear model.
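fitted() only covers the observations used to fit the model. For predictions at other values of the independent variable, base R provides predict(). The sketch below assumes the same model as above; the new_cars data frame and its weights are hypothetical values chosen for illustration.

```r
# Same model as in Step 1
model <- lm(mpg ~ wt, data = mtcars)

# Without newdata, predict() returns the same values as fitted()
all.equal(predict(model), fitted(model))

# Hypothetical new cars; wt is in units of 1000 lbs, as in mtcars
new_cars <- data.frame(wt = c(2.5, 3.0, 3.5))

# Predicted mpg for the new weights
new_pred <- predict(model, newdata = new_cars)
new_pred
```

The column name in newdata must match the predictor name used in the formula (wt here), otherwise predict() cannot find the variable.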

Step 3: Extract the Residuals

To extract the residuals, use the residuals() function.

R
# Extract residuals
residuals_values <- residuals(model)
head(residuals_values)

Output:

        Mazda RX4     Mazda RX4 Wag        Datsun 710    Hornet 4 Drive Hornet Sportabout           Valiant 
       -2.2826106        -0.9197704        -2.0859521         1.2973499        -0.2001440        -0.6932545 

The residuals() function returns the difference between the observed and predicted values for all observations.
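It is often convenient to view the observed values, predicted values and residuals side by side. The following is one possible way to do that, assuming the same model; the results data frame and its column names are illustrative.

```r
# Same model as in Step 1
model <- lm(mpg ~ wt, data = mtcars)

# Collect observed values, predicted values and residuals in one data frame
results <- data.frame(
  observed  = mtcars$mpg,
  predicted = fitted(model),
  residual  = residuals(model)
)
head(results)

# By definition, observed = predicted + residual
all.equal(results$observed, results$predicted + results$residual)

# With an intercept in the model, least-squares residuals sum to (numerically) zero
sum(results$residual)
```

This works directly because mtcars has no missing values. If the data contained NAs, lm() would drop those rows by default and the extracted vectors would be shorter than the original data; fitting with na.action = na.exclude keeps them aligned by padding with NA.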

Conclusion

Extracting the residuals and predicted values from a linear model in R is simple using the fitted() and residuals() functions. These values are crucial for evaluating model performance and identifying areas where the model may not fit the data well. You can also visualize these relationships to get a clearer understanding of the model's fit.
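As a starting point for visualization, a residuals-versus-fitted plot can be drawn with base R graphics. This is a minimal sketch assuming the same model; the axis labels and title are arbitrary choices.

```r
# Same model as in Step 1
model <- lm(mpg ~ wt, data = mtcars)

# Residuals vs fitted values: look for curvature or a changing spread
plot(fitted(model), residuals(model),
     xlab = "Fitted values (mpg)", ylab = "Residuals",
     main = "Residuals vs Fitted")

# Reference line at zero residual
abline(h = 0, lty = 2)
```

Points scattered evenly around the dashed line suggest the linear form is adequate, while a clear pattern suggests it is not. A similar diagnostic plot is available through plot(model, which = 1).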

