
How to Convert List of Regression Outputs into Data Frames in R

Last Updated : 26 Jun, 2024

When you run multiple regression models in R, you often end up with a list of regression outputs that you want to consolidate into a single data frame for easier analysis and comparison. This is particularly useful when performing batch regression analysis across multiple datasets or different subsets of a single dataset. This article walks through how to convert a list of regression outputs into data frames in R.

What are Regression Outputs?

Regression outputs are the results obtained from fitting a regression model to data. They describe the estimated relationship between the independent variables (predictors) and the dependent variable (response), typically through coefficient estimates, standard errors, test statistics, and p-values. The steps below show how to convert a list of such outputs into data frames in the R programming language.
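
As a quick illustration of what such an output contains, the sketch below fits a single model on R's built-in mtcars dataset and inspects the fitted object. The dataset and the name fit are assumptions made here for illustration; they are not used in the rest of this article.

```r
# Fit a single linear model on the built-in mtcars data (illustrative only)
fit <- lm(mpg ~ wt, data = mtcars)

# The fitted model is a list-like object of class "lm"
class(fit)

# Components stored in the object (coefficients, residuals, fitted.values, ...)
names(fit)

# Point estimates only; standard errors and p-values come from summary(fit)
coef(fit)
```

The fitted object alone holds the estimates, while the inferential statistics are computed by summary(), which is why the steps below work with summaries rather than with the raw model objects.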

Step 1: Run Multiple Regression Models and Store Outputs in a List

First, let's run several regression models and store their outputs in a list.

R
# The stats package (which provides lm()) is attached by default,
# so this call is optional
library(stats)

# Generate example data
set.seed(123)
n <- 100
data <- data.frame(
  y1 = rnorm(n),
  y2 = rnorm(n),
  y3 = rnorm(n),
  x1 = rnorm(n),
  x2 = rnorm(n)
)

# Run multiple regression models and store outputs in a list
model_list <- list(
  lm(y1 ~ x1 + x2, data = data),
  lm(y2 ~ x1 + x2, data = data),
  lm(y3 ~ x1 + x2, data = data)
)

# Print the model list
print(model_list)

Output:


[[1]]

Call:
lm(formula = y1 ~ x1 + x2, data = data)

Coefficients:
(Intercept)           x1           x2  
    0.10779     -0.04201     -0.17865  


[[2]]

Call:
lm(formula = y2 ~ x1 + x2, data = data)

Coefficients:
(Intercept)           x1           x2  
   -0.09272      0.03848     -0.12689  


[[3]]

Call:
lm(formula = y3 ~ x1 + x2, data = data)

Coefficients:
(Intercept)           x1           x2  
    0.12158     -0.04148     -0.02470  
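
When the models differ only in the response variable, the list can also be built programmatically instead of typing each call. The following is a minimal sketch that assumes the same simulated data as above; the names responses and named_models are introduced here for illustration. Naming the list elements makes it easier to tell the models apart later.

```r
# Recreate the example data so this block runs on its own
set.seed(123)
n <- 100
data <- data.frame(
  y1 = rnorm(n), y2 = rnorm(n), y3 = rnorm(n),
  x1 = rnorm(n), x2 = rnorm(n)
)

# Response variables to model (illustrative name)
responses <- c("y1", "y2", "y3")

# Build one formula per response with reformulate() and fit the model
named_models <- lapply(responses, function(resp) {
  lm(reformulate(c("x1", "x2"), response = resp), data = data)
})

# Label each list element with the response it models
names(named_models) <- responses

# Coefficients of the first model, accessed by name
coef(named_models$y1)
```

One trade-off of this approach is that the Call shown when printing each model displays the reformulate() expression rather than the literal formula.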

Step 2: Extract Regression Summaries

Next, we need to extract the summary statistics (e.g., coefficients, standard errors, p-values) from each model and store them in a list.

R
# Extract summaries of each model
summary_list <- lapply(model_list, summary)

# Print summaries
print(summary_list)

Output:

[[1]]

Call:
lm(formula = y1 ~ x1 + x2, data = data)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.53661 -0.64238 -0.03869  0.53731  2.12391 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)  0.10779    0.09095   1.185   0.2388  
x1          -0.04201    0.08746  -0.480   0.6321  
x2          -0.17865    0.09183  -1.945   0.0546 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9038 on 97 degrees of freedom
Multiple R-squared:  0.03942,    Adjusted R-squared:  0.01962 
F-statistic:  1.99 on 2 and 97 DF,  p-value: 0.1422


[[2]]

Call:
lm(formula = y2 ~ x1 + x2, data = data)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.9726 -0.6999 -0.0814  0.6153  3.2448 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.09272    0.09738  -0.952    0.343
x1           0.03848    0.09364   0.411    0.682
x2          -0.12689    0.09832  -1.291    0.200

Residual standard error: 0.9677 on 97 degrees of freedom
Multiple R-squared:  0.01877,    Adjusted R-squared:  -0.001462 
F-statistic: 0.9278 on 2 and 97 DF,  p-value: 0.3989


[[3]]

Call:
lm(formula = y3 ~ x1 + x2, data = data)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.87474 -0.64772 -0.06787  0.66962  2.17843 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.12158    0.09644   1.261    0.210
x1          -0.04148    0.09274  -0.447    0.656
x2          -0.02470    0.09737  -0.254    0.800

Residual standard error: 0.9583 on 97 degrees of freedom
Multiple R-squared:  0.002674,    Adjusted R-squared:  -0.01789 
F-statistic: 0.1301 on 2 and 97 DF,  p-value: 0.8782
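
The printed summaries are only a display. Each summary object is a list whose pieces can be pulled out directly, which is what the next step relies on. A short sketch, refitting one of the models above so the block runs on its own (the names s and coef_mat are illustrative):

```r
# Recreate the example data and one model so this block is self-contained
set.seed(123)
n <- 100
data <- data.frame(
  y1 = rnorm(n), y2 = rnorm(n), y3 = rnorm(n),
  x1 = rnorm(n), x2 = rnorm(n)
)
s <- summary(lm(y1 ~ x1 + x2, data = data))

# Coefficient table: a numeric matrix with one row per term
coef_mat <- s$coefficients
colnames(coef_mat)  # "Estimate" "Std. Error" "t value" "Pr(>|t|)"

# Model-level statistics stored alongside the coefficient table
s$r.squared
s$adj.r.squared
s$sigma       # residual standard error
s$fstatistic  # F value with numerator and denominator degrees of freedom
```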

Step 3: Convert Summaries to Data Frames

We'll create a function to extract the desired summary statistics and convert each summary into a data frame.

R
# Function to convert regression summary to data frame
summary_to_df <- function(summary_obj) {
  coefs <- summary_obj$coefficients
  df <- data.frame(
    term = rownames(coefs),
    estimate = coefs[, "Estimate"],
    std.error = coefs[, "Std. Error"],
    statistic = coefs[, "t value"],
    p.value = coefs[, "Pr(>|t|)"]
  )
  return(df)
}

# Apply the function to each summary
df_list <- lapply(summary_list, summary_to_df)

# Print the data frames
print(df_list)

Output:

[[1]]
                   term    estimate  std.error  statistic    p.value
(Intercept) (Intercept)  0.10779481 0.09095172  1.1851871 0.23883900
x1                   x1 -0.04201096 0.08746213 -0.4803332 0.63207201
x2                   x2 -0.17865367 0.09183242 -1.9454314 0.05461809

[[2]]
                   term    estimate  std.error  statistic   p.value
(Intercept) (Intercept) -0.09272146 0.09737947 -0.9521664 0.3433792
x1                   x1  0.03847563 0.09364325  0.4108745 0.6820708
x2                   x2 -0.12689206 0.09832240 -1.2905713 0.1999201

[[3]]
                   term    estimate  std.error  statistic   p.value
(Intercept) (Intercept)  0.12157671 0.09643802  1.2606720 0.2104511
x1                   x1 -0.04147896 0.09273793 -0.4472707 0.6556768
x2                   x2 -0.02469593 0.09737184 -0.2536250 0.8003221
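
The same pattern extends to model-level statistics such as R-squared, which are often compared alongside the coefficients. The helper below, summary_to_fit_df, is a hypothetical name introduced here and is not part of the original steps. It is a sketch that assumes every model includes at least one predictor, so that the F-statistic is present in the summary.

```r
# Recreate the data, models, and summaries so this block runs on its own
set.seed(123)
n <- 100
data <- data.frame(
  y1 = rnorm(n), y2 = rnorm(n), y3 = rnorm(n),
  x1 = rnorm(n), x2 = rnorm(n)
)
model_list <- list(
  lm(y1 ~ x1 + x2, data = data),
  lm(y2 ~ x1 + x2, data = data),
  lm(y3 ~ x1 + x2, data = data)
)
summary_list <- lapply(model_list, summary)

# Hypothetical helper: one row of model-level statistics per summary
summary_to_fit_df <- function(summary_obj) {
  f <- summary_obj$fstatistic  # named vector: value, numdf, dendf
  data.frame(
    r.squared = summary_obj$r.squared,
    adj.r.squared = summary_obj$adj.r.squared,
    sigma = summary_obj$sigma,
    f.statistic = unname(f["value"]),
    # Overall F-test p-value, computed from the F distribution
    f.p.value = unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
  )
}

# One row per model, with an identifier column
fit_df <- do.call(rbind, lapply(summary_list, summary_to_fit_df))
fit_df$model <- paste0("Model_", seq_along(summary_list))
print(fit_df)
```

The p-value is recomputed with pf() because the summary object stores the F-statistic and its degrees of freedom, not the p-value shown in the printed output.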

Step 4: Combine Data Frames into a Single Data Frame

Finally, we can combine the individual data frames into a single data frame for easier comparison.

R
# Combine data frames into one
combined_df <- do.call(rbind, df_list)

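# Add a model identifier, repeating each label once per row of its data frame
# (this also works if the models have different numbers of terms)
combined_df$model <- rep(paste0("Model_", seq_along(df_list)), times = sapply(df_list, nrow))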

# Print the combined data frame
print(combined_df)

Output:

                    term    estimate  std.error  statistic    p.value   model
(Intercept)  (Intercept)  0.10779481 0.09095172  1.1851871 0.23883900 Model_1
x1                    x1 -0.04201096 0.08746213 -0.4803332 0.63207201 Model_1
x2                    x2 -0.17865367 0.09183242 -1.9454314 0.05461809 Model_1
(Intercept)1 (Intercept) -0.09272146 0.09737947 -0.9521664 0.34337924 Model_2
x11                   x1  0.03847563 0.09364325  0.4108745 0.68207080 Model_2
x21                   x2 -0.12689206 0.09832240 -1.2905713 0.19992015 Model_2
(Intercept)2 (Intercept)  0.12157671 0.09643802  1.2606720 0.21045106 Model_3
x12                   x1 -0.04147896 0.09273793 -0.4472707 0.65567678 Model_3
x22                   x2 -0.02469593 0.09737184 -0.2536250 0.80032211 Model_3

To recap the steps above:

  • We generate example data and run multiple regression models, storing their outputs in a list.
  • We extract the summary statistics of each model using the summary() function and store them in a list.
  • We create a function, summary_to_df, that converts the summary object into a data frame containing the coefficients, standard errors, t-values, and p-values. We then apply this function to each summary in the list.
  • We combine the individual data frames into a single data frame using do.call(rbind, ...) and add a model identifier column for clarity.
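
One side effect visible in the combined output is that rbind() makes the duplicated row names unique by appending digits (for example x11 and x21). A possible variation, sketched below under the assumption that the list of models is named, attaches the model label to each data frame before binding and then drops the row names. The model names, the variable tagged, and the reduced formula of the third model are chosen here purely for illustration.

```r
# Recreate the example data so this block runs on its own
set.seed(123)
n <- 100
data <- data.frame(
  y1 = rnorm(n), y2 = rnorm(n), y3 = rnorm(n),
  x1 = rnorm(n), x2 = rnorm(n)
)

# Named list of models; the third has fewer terms on purpose
model_list <- list(
  Model_1 = lm(y1 ~ x1 + x2, data = data),
  Model_2 = lm(y2 ~ x1 + x2, data = data),
  Model_3 = lm(y3 ~ x1, data = data)
)

# Same conversion function as in Step 3
summary_to_df <- function(summary_obj) {
  coefs <- summary_obj$coefficients
  data.frame(
    term = rownames(coefs),
    estimate = coefs[, "Estimate"],
    std.error = coefs[, "Std. Error"],
    statistic = coefs[, "t value"],
    p.value = coefs[, "Pr(>|t|)"]
  )
}

df_list <- lapply(lapply(model_list, summary), summary_to_df)

# Tag each data frame with its model name before combining
tagged <- Map(function(df, id) cbind(model = id, df), df_list, names(df_list))

# Bind the rows and replace the row names with plain row numbers
combined_df <- do.call(rbind, tagged)
rownames(combined_df) <- NULL
print(combined_df)
```

Because the label is attached per data frame, the result does not depend on every model having the same number of coefficients.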

Conclusion

Converting a list of regression outputs into data frames in R is a valuable technique for summarizing and comparing multiple models. By following the steps outlined in this guide, you can efficiently extract and consolidate regression results, facilitating further analysis and interpretation. This approach is particularly useful in large-scale regression analysis, meta-analysis, and automated reporting workflows.

