Model Summary Table as a Data Frame

Hi

I have been using stargazer and texreg to display tables of regression models. This works about 90% of the time, but modifying the tables often means hand-editing the HTML or LaTeX output. Given how good data frames can look in R Markdown documents (e.g., with kable(), kableExtra, flextable, or gt), does anyone know of a package or function that takes a list of models and returns a data frame of the coefficients and model-level summary measures?

For example, the output could look like this:

         term            Model 1             Model 2           Model 3           Model 4
1 (Intercept) -86.08<br/>(13.68) -366.34<br/>(98.62)  65.27<br/>(3.25) -79.9<br/>(13.63)
2         sat   13.35<br/>(1.24)   62.72<br/>(17.27)              <NA>  13.02<br/>(1.22)
3    I(sat^2)               <NA>    -2.15<br/>(0.75)              <NA>              <NA>
4      public               <NA>                <NA> -14.24<br/>(5.91) 15.71<br/>(33.45)
5  public:sat               <NA>                <NA>              <NA>   -2.24<br/>(3.1)
6   r.squared               0.79               0.835             0.158             0.845
7       sigma               7.79                7.02             15.61              6.91
8        AICc              233.9               228.5             279.8             229.2

Have a look at the broom package. This might offer exactly what you are looking for?

JW

Does broom have functionality to display the results of multiple models? I know tidy() and glance() will output results from a single model, but I don't think you can give those functions multiple model objects.
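One possible workaround, sketched here under assumptions rather than as a built-in broom feature, is to apply tidy() and glance() to each model in a named list, stack the results, and pivot so that each model becomes a column. The mtcars models, the object names, and the choice of r.squared and sigma as summary rows are all placeholders for illustration.

```r
library(broom)
library(dplyr)
library(tidyr)

# Placeholder models on mtcars (an assumption; substitute your own fits)
models <- list(
  "Model 1" = lm(mpg ~ wt, data = mtcars),
  "Model 2" = lm(mpg ~ wt + hp, data = mtcars)
)

# Coefficient rows: "estimate<br/>(std.error)" for each term in each model
coef_rows <- bind_rows(lapply(models, tidy), .id = "model") %>%
  mutate(value = sprintf("%.2f<br/>(%.2f)", estimate, std.error)) %>%
  select(model, term, value)

# Model-level rows: a few glance() statistics, formatted as text
fit_rows <- bind_rows(lapply(models, glance), .id = "model") %>%
  select(model, r.squared, sigma) %>%
  pivot_longer(-model, names_to = "term", values_to = "value") %>%
  mutate(value = sprintf("%.3f", value))

# One row per term or statistic, one column per model;
# terms missing from a model are left as NA
summary_df <- bind_rows(coef_rows, fit_rows) %>%
  pivot_wider(names_from = model, values_from = value)

summary_df
```

The resulting data frame can then be handed to kable(), flextable, or gt for styling.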

Hello!

For presenting output from multiple models try gtsummary.


Hey @AndyZ,

It's possible to get the results in a data frame using the gtsummary package. You will, however, need the dev version of the package. I've included an example below.

# install dev version
remotes::install_github("ddsjoberg/gtsummary@e65083183e201408c0550042a61f62ac8d4f958c")
library(gtsummary)

# build summary table for a model
t1 <- 
  lm(age ~ ttdeath + marker, trial) %>%
  tbl_regression()  %>%
  add_significance_stars(pattern = "{estimate}{stars}<br>({std.error})",
                         hide_se = TRUE) %>%
  add_glance_table(include = c(nobs, r.squared, AIC))

# merge model summaries together, and convert to data frame
tbl_merge(list(t1, t1)) %>%
  as_tibble(col_labels = FALSE)
#> # A tibble: 5 x 3
#>   label                  estimate_1       estimate_2      
#>   <chr>                  <chr>            <chr>           
#> 1 Months to Death/Censor -0.15<br>(0.210) -0.15<br>(0.210)
#> 2 Marker Level (ng/mL)   -0.02<br>(1.26)  -0.02<br>(1.26) 
#> 3 No. Obs.               179              179             
#> 4 R²                     0.003            0.003           
#> 5 AIC                    1,471            1,471

Created on 2021-03-25 by the reprex package (v1.0.0)

Awesome! Good to know that it is in the dev version. I will check it out.
Andy


Hi, I am following up on this question because I am also interested in making a model summary table. I wrote 25 days ago but was not able to make the reprex, and I am encumbered by working on a secure server. I have 24 models (6 cell-types x 4 models) and I would like to start off with a simple table with the four models as rows and the 6 cell-types as columns. I already have the summaries in a table, so my problem is that I cannot use the tbl_regression() function to make the gtsummary table. Would it be better to try with gt instead?

The gt package will definitely help you turn model results that are already saved in data frames into a publication-ready table.

Thank you. I am making progress. I made the table with {gt} and have a question about stars of significance for p-values. I used the function that Rich Iannone posted on GitHub (see attached). Unfortunately, I can't have fmt_stars() and fmt_number() in the same chunk. One overrides the other. Attached is the code where the p-values are visible in the table; if fmt_stars() comes last instead, only the stars are visible and no p-values. Sorry not to have a reprex - I need to figure out how to export it from the server. Thank you for your help!

If you can post a data frame and example code, I can attempt to help!

The data:

My table that will not show the stars of significance.

I get an error code that the data is not a gt_table format when I try and add footnotes.

Can you add the data and code as a code chunk instead of an image? Then I can copy the text into my editor.

Hi, I hope this works. This is the first time I have tried posting a reprex.

table_pval <- tibble::tribble(
                 ~Model,      ~Predictor,               ~Mono,                 ~Neu,                ~NK,               ~CD4T,               ~CD8T,              ~Bcell,
              "Model 1",  "Case-Control",   0.628417355082566, 0.000519604887456513, 0.0925500673787777, 0.00584722383176443, 0.00110636778126264, 0.00448981271134747,
             "Model 2a",  "Case-Control",   0.776347680248465,  0.00333944782506215,  0.286264314573566,  0.0186336833741003, 0.00974265172264351,  0.0260571208261442,
             "Model 2b", "Hour diff SCZ",  0.0684626664059519,    0.531113835122065,  0.232291839314412,   0.661674211090156,   0.303802763736016,    0.33771624522132,
              "Model 3", "Hour diff CTL", 0.00692340728223628,    0.851856936368742,   0.25075981812509,   0.945866144505559,   0.348892587472004,   0.680282551421654
             )
head(table_pval)

library(magrittr)
library(gt)

## function for writing stars of significance (https://github.com/rstudio/gt/issues/187)


fmt_stars <- function(tble.pval,
                      columns,
                      rows = NULL) {
  rows <- rlang::enquo(rows)
  
  fmt(
    data = tble.pval,
    columns = columns,
    rows = !!rows,
    fns = list(
      default = function(x) {
        
        x_str <-
          dplyr::case_when(
            between(x, 0, 0.001) ~ "***",
            between(x, 0, 0.01) ~ "**",
            between(x, 0, 0.05) ~ "*",
            TRUE ~ "."
          )
      }
    )
  )
}

## table of pvalues that doesn't show stars of significance


tbl.pval <- gt(table_pval) %>%
  fmt_stars(columns = 3:8) %>%
  fmt_number(
    columns = 3:8,
    decimals = 3) %>% 
  tab_header(
    title = "SCZ status and Time-of-blood draw",
    subtitle = "Impact on estimated cell-type proportions"
  )

tbl.pval

## get an error that data is not gt_table format when trying to add a footnote 

tble.pval <- gt(table_pval) %>%
  fmt_stars(columns = 3:8)%>%
  fmt_number(
    columns = 3:8,
    decimals = 3) %>% 
  tab_header(
    title = "SCZ status and Time-of-blood draw",
    subtitle = "Impact on estimated cell-type proportions"
  ) %>%
  tab_footnote(
    footnote = md("Represents **hours from baseline 07:00**."),
    locations = cells_body(
      columns = 2, rows = starts_with("Hour")
    ) %>%
      opt_footnote_marks(marks = "+"))

tble.pval

Thanks for updating. Made a few changes to the code. I think this should work for you.

library(tidyverse)
library(gt)

table_pval <- 
  tibble::tribble(
    ~Model,      ~Predictor,               ~Mono,                 ~Neu,                ~NK,               ~CD4T,               ~CD8T,              ~Bcell,
    "Model 1",  "Case-Control",   0.628417355082566, 0.000519604887456513, 0.0925500673787777, 0.00584722383176443, 0.00110636778126264, 0.00448981271134747,
    "Model 2a",  "Case-Control",   0.776347680248465,  0.00333944782506215,  0.286264314573566,  0.0186336833741003, 0.00974265172264351,  0.0260571208261442,
    "Model 2b", "Hour diff SCZ",  0.0684626664059519,    0.531113835122065,  0.232291839314412,   0.661674211090156,   0.303802763736016,    0.33771624522132,
    "Model 3", "Hour diff CTL", 0.00692340728223628,    0.851856936368742,   0.25075981812509,   0.945866144505559,   0.348892587472004,   0.680282551421654
  )

# function to round p-value and append stars
add_stars <- function(x, decimals = 3) {
  # create vector of stars
  x_stars <-
    dplyr::case_when(
      x >= 0.05 ~ "",
      x >= 0.01 ~ "*",
      x >= 0.001 ~ "**",
      TRUE ~ "***"
    )
  
  # paste together rounded p-value and stars
  paste0(
    gtsummary::style_number(x, digits = decimals),
    x_stars
  )
}

# gt function to round p-values and add stars
fmt_stars <- function(data,
                      columns,
                      rows = NULL,
                      decimals = 3) {
  
  fmt(
    data = data,
    columns = {{ columns }},
    rows = {{ rows }},
    fns = purrr::partial(add_stars, decimals = decimals)
  )
}

gt(table_pval) %>%
  fmt_stars(columns = 3:8, decimals = 5) 
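As reported earlier in the thread, when two formatters target the same cells only the last one applied is shown, which is why the approach above builds the rounded number and the stars inside a single function. If you would rather not depend on gtsummary for the rounding, a base R variant might look like the sketch below. The name add_stars_base is hypothetical, and formatC() is swapped in for gtsummary::style_number(), so the rounding details may differ slightly.

```r
# Hypothetical base R helper: round a p-value and append significance stars
add_stars_base <- function(x, decimals = 3) {
  # thresholds mirror the ones used above: <0.001 ***, <0.01 **, <0.05 *
  stars <- ifelse(x < 0.001, "***",
           ifelse(x < 0.01,  "**",
           ifelse(x < 0.05,  "*", "")))

  # fixed-notation rounding, then paste the number and stars into one string
  paste0(formatC(x, digits = decimals, format = "f"), stars)
}

add_stars_base(c(0.6284, 0.0005196, 0.00585, 0.0186))
#> [1] "0.628"    "0.001***" "0.006**"  "0.019*"
```

A function like this can be passed to the fns argument of fmt() in the same way add_stars is above.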

Thank you very, very much. It is helpful to read your solution. Did you see the second piece of code I pasted for the table? I was trying to add a footnote and got an error saying that the data was not formatted as a gt_table.

I am not sure what your footnote goals are. The text is about time, and you've got a table full of p-values. I would just review the tab_footnote() help file and go from there.
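One likely cause of that error, judging only from the posted code: the closing parenthesis of tab_footnote() sits after opt_footnote_marks(), so the result of cells_body() is piped into opt_footnote_marks() instead of the table, and that function complains that it did not receive a gt table. Below is a minimal sketch with the parentheses rebalanced. It uses a cut-down version of the data, selects rows with a logical expression on the Predictor column rather than starts_with(), and uses the built-in "standard" marks keyword; all three are assumptions made for illustration, and whether a single custom mark such as "+" is accepted is worth checking in the opt_footnote_marks() help file.

```r
library(gt)

# Cut-down version of the posted table (two cell-type columns only)
table_pval <- data.frame(
  Model     = c("Model 1", "Model 2a", "Model 2b", "Model 3"),
  Predictor = c("Case-Control", "Case-Control", "Hour diff SCZ", "Hour diff CTL"),
  Mono      = c(0.6284, 0.7763, 0.0685, 0.0069),
  Neu       = c(0.0005, 0.0033, 0.5311, 0.8519)
)

tbl <- gt(table_pval) %>%
  tab_header(
    title = "SCZ status and Time-of-blood draw",
    subtitle = "Impact on estimated cell-type proportions"
  ) %>%
  tab_footnote(
    footnote = md("Represents **hours from baseline 07:00**."),
    locations = cells_body(
      columns = Predictor,
      rows = grepl("^Hour", Predictor)  # rows whose Predictor starts with "Hour"
    )
  ) %>%                                 # tab_footnote() is closed before the pipe
  opt_footnote_marks(marks = "standard")

tbl
```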

Ok will do. I thought I had to define in the table what the variable "Hour diff" meant. But thank you again for your help.
