I'm having trouble using caret::extractPrediction() with a GLM that has factor covariates (see the reprex below). It appears that caret does not use the model matrix constructed by glm(). I can construct the model matrix myself so that I'm not passing factors into caret::train(), but this feels a bit hacky.
This question also appears on StackOverflow here: https://stackoverflow.com/questions/29490751/does-extractprediction-support-factors, but there are no answers.
library(tidyverse)
#> ── Attaching packages ───────────────── tidyverse 1.2.1 ──
#> ✔ ggplot2 3.0.0     ✔ purrr   0.2.5
#> ✔ tibble  1.4.2     ✔ dplyr   0.7.6
#> ✔ tidyr   0.8.1     ✔ stringr 1.3.1
#> ✔ readr   1.1.1     ✔ forcats 0.3.0
#> ── Conflicts ──────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()
library(caret)
#> Loading required package: lattice
#>
#> Attaching package: 'caret'
#> The following object is masked from 'package:purrr':
#>
#>     lift

data(mtcars)

mtcars2 <- mtcars %>%
  mutate(cyl = as.factor(cyl))

train_glm <- caret::train(
  mpg ~ .
  , data = mtcars2
  , method = 'glm'
  , trControl = trainControl(
    method = 'cv'
    , number = 5
  )
)

extractPrediction(
  list(train_glm)
)
#> Error in eval(predvars, data, env): object 'cyl6' not found
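
For reference, the model-matrix workaround mentioned above might look roughly like the sketch below. It assumes that expanding the factor with model.matrix() and passing the result through the x/y interface of caret::train() is acceptable; the names x_mm, train_glm_mm, and preds are only illustrative, and I have not checked this against every caret version.

```r
library(caret)

data(mtcars)

# Same setup as the reprex: treat cyl as a factor
mtcars2 <- mtcars
mtcars2$cyl <- as.factor(mtcars2$cyl)

# Expand the factor into dummy columns up front and drop the intercept
# column, so that train() only ever sees numeric predictors
x_mm <- as.data.frame(model.matrix(mpg ~ ., data = mtcars2)[, -1])

set.seed(1)  # only to make the CV folds reproducible
train_glm_mm <- caret::train(
  x = x_mm
  , y = mtcars2$mpg
  , method = 'glm'
  , trControl = trainControl(
    method = 'cv'
    , number = 5
  )
)

# The stored training data now has the same columns the GLM was fit on,
# so the lookup for 'cyl6' should succeed
preds <- extractPrediction(
  list(train_glm_mm)
)
head(preds)
```

This should sidestep the error because the fitted model and the stored training data share the same dummy-coded column names, but it means repeating the same expansion by hand for any new data, which is why it feels hacky.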