Use modelr::gather_predictions with list columns


#1

I’m trying to adopt a tidy approach to fitting multiple models. I’ve only recently begun using purrr, so may have some naive ideas about that package. I’m trying to fit multiple models to a single data sample. I describe the models in a tibble and map() manages to apply lm() and store results. However, I’m clearly missing something obvious when I try to use the column of lm model objects to store predictions and residuals in my sample data frame.

MWE below. (The formulae are daft and just for illustration.)

The final line will generate results, but I’m obviously trying to avoid explicit arguments. The penultimate line generates the following error:

Error in UseMethod("predict") : 
no applicable method for 'predict' applied to an object of class "list"
library(modelr)
library(tidyverse)

tblModel <- tibble(
    ModelName = c('drat + wt', '0 + drat + wt', 'hp + disp')
  , Formula = paste('mpg ~ ', ModelName)
  ) %>% 
  mutate(
    Model = map(Formula, lm, data = mtcars)
  )

tblPredictions <- modelr::gather_predictions(mtcars, tblModel$Model)
tblPredictions <- modelr::gather_predictions(mtcars, tblModel$Model[[1]], tblModel$Model[[2]])

#2

The issue is that gather_predictions takes dots (...), not a list, for the additional arguments. This means that you get to use another purrr function, lift! That is the same as lift_dl, as it turns a function that takes dots (d) into a list (l). Since mtcars isn’t part of that list, you pass that directly to lift, not the generated function. Also, I used purrr::set_names on the Model column, since without names the model column of tblPredictions is the lm result deparsed and is basically unusable.

library(modelr)
suppressPackageStartupMessages(library(tidyverse))

tblModel <- tibble(
  ModelName = c('drat + wt', '0 + drat + wt', 'hp + disp')
  , Formula = paste('mpg ~ ', ModelName)
) %>% 
  mutate(
    Model = map(Formula, lm, data = mtcars) %>%
      # Added set_names to make gather_predictions output readable
      set_names(ModelName)
  )

tblPredictions <- lift(modelr::gather_predictions, 
                       data = mtcars)(tblModel$Model)

glimpse(tblPredictions)
#> Observations: 96
#> Variables: 13
#> $ model <chr> "drat + wt", "drat + wt", "drat + wt", "drat + wt", "dra...
#> $ mpg   <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19...
#> $ cyl   <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4,...
#> $ disp  <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, ...
#> $ hp    <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180,...
#> $ drat  <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3....
#> $ wt    <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, ...
#> $ qsec  <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, ...
#> $ vs    <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1,...
#> $ am    <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,...
#> $ gear  <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4,...
#> $ carb  <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2,...
#> $ pred  <dbl> 23.384912, 22.165275, 24.747654, 19.356250, 18.381074, 1...

#3

That works brilliantly, thanks! I’d just started googling for conversion of list -> dots. Looks like lift() fits the bill.


#4

I should mention that lift/lift_dl is something of a wrapper around do.call, so you can get the same results like this:

tblPredictions <- do.call(gather_predictions, c(list(data = mtcars), tblModel$Model))

I like the fact that lift_* has other “directions” for argument translation, though.


#5

wow cool~
I didn't know lift_*