I run into the "fit() must be used with GAM models (due to its use of formulas)." error when trying to fit a GAM inside a workflow set.
I have seen a post in the community, "Error in `fit_xy()` with GAM model", which explains that we need to add "formula = gam_formula" inside the 'add_model()' function when defining a workflow.
However, I am not sure where I could do this when defining a workflow inside a workflow set.
Here's a reprex, slightly modified from the original post:
library(tidyverse, quietly = TRUE)
library(dplyr, quietly = TRUE)
library(sf, quietly = TRUE)
library(tidymodels, quietly = TRUE)
library(EnvStats, quietly = TRUE)
dep_var <- "y"
treat_var <- "xt"
ID_vars <- c("sampleID", "replicateID")
quant_indep <- c("X1", "X2", "X3")
qual_indep <- c("N1", "N2")
all_indep_vars <- c(ID_vars, quant_indep, qual_indep)
# create artificial dataset
txt <- c("A", "B", "C")
samp_ids <- c("sample1", "sample2")
repl_ids <- c("rep1", "rep2", "rep3", "rep4", "rep5")
modl_df <- data.frame(
  y = EnvStats::rnormTrunc(250, mean = 0, sd = 0.7, min = -1.5, max = 1.5),
  sampleID = array(sample(samp_ids), 250),
  replicateID = array(sample(repl_ids), 250),
  xt = rnorm(250, mean = 0, sd = 1),
  x = matrix(sample(runif(5000, min = -0.5, max = 1.5), 750), 250, 3),
  c = matrix(sample(txt, 500, replace = TRUE), 250, 2)
)
modl_df <- modl_df %>% dplyr::rename(X1 = x.1, X2 = x.2, X3 = x.3, N1 = c.1, N2 = c.2)
my_split <- initial_split(modl_df)
train <- training(my_split)
test <- testing(my_split)
my_resamples <- vfold_cv(train, v = 3)
# Workflow for GAM modeling
gam_formula <- as.formula("y ~ xt + sampleID + replicateID + X1 + X2 + X3 + N1 + N2")
gam_recipe <- recipe(formula = gam_formula, data = train)
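gam_model <- gen_additive_mod(select_features = TRUE, engine = "mgcv") %>%
  set_mode("regression")
# single workflow: the formula is passed to add_model()
gam_wkflo <- workflow() %>%
  add_model(gam_model, formula = gam_formula) %>%
  add_recipe(gam_recipe)
response_gam <- gam_wkflo %>% parsnip::fit(data = modl_df)
# workflow set: it is not clear where the model formula should go
all_workflows <- workflow_set(
  preproc = list("recipe" = gam_recipe),
  models = list(my_gam_spec = gam_model)
)
all_workflows <- all_workflows %>%
  workflow_map(resamples = my_resamples, grid = 4, verbose = TRUE)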
i No tuning parameters. `fit_resamples()` will be attempted
i 1 of 1 resampling: recipe_my_gam_spec
→ A | error: fit() must be used with GAM models (due to its use of formulas).
There were issues with some computations A: x3
1 of 1 resampling: recipe_my_gam_spec failed with
1: All models failed. Run `show_notes(.Last.tune.result)` for more information.
2: Unknown or uninitialised column: