Using plot() with tidymodels - why does extract_fit_engine() return an abstraction?

I am trying to understand the logic behind extract_fit_engine(). I want to use plot() on a model fit in the tidymodels ecosystem, and I learned yesterday about extract_fit_engine(). What I don't understand is why it returns an "abstraction" for the call rather than the model itself.

Here is what I am talking about:

library(ISLR2)
suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(tidymodels))

Smarket_train <- Smarket %>%
  filter(Year != 2005)

Smarket_test <- Smarket %>% # Smarket.2005
  filter(Year == 2005)

library(discrim)
lda_spec <- discrim_linear() %>%
  set_mode("classification") %>%
  set_engine("MASS")

the_rec <- recipe(
  Direction ~ Lag1 + Lag2,
  data = Smarket
)

the_workflow <- workflow() %>%
  add_recipe(the_rec) %>%
  add_model(lda_spec)

the_workflow_fit_lda <-
  fit(the_workflow, data = Smarket_train)

the_workflow_fit_lda_fit0 <-
  the_workflow_fit_lda %>%
  extract_fit_engine() 

This gives me an lda object ... fabulous!

class(the_workflow_fit_lda_fit0)

But the call is an abstraction that does not reference the actual variables/data:

the_workflow_fit_lda_fit0$call

So, I can't use it with plot().

I know I can extract a parsnip object, repair it, then extract the engine:

the_workflow_fit_lda_fit1 <-
  the_workflow_fit_lda %>%
  extract_fit_parsnip() 

the_workflow_fit_lda_fit999 <-
  the_workflow_fit_lda_fit1 %>%
  parsnip::repair_call(rename(Smarket_train, ..y = Direction)) %>%
  extract_fit_engine() 

class(the_workflow_fit_lda_fit999)

the_workflow_fit_lda_fit999 %>%
  plot()

That extra step is extremely counterintuitive, and I think it will make my students queasy. Is there a function that will pull a usable fit directly from a tidymodels workflow? The warning in the draft tidymodels book (6 Fitting models with parsnip | Tidy Modeling with R) about pulling the parsnip object directly from a workflow makes me nervous. Perhaps there should be an extract_fit_results() that automatically does the repair_call() work above. Does that make sense, or am I totally misunderstanding the logic here?

We keep things modular and do operations only when they are needed. repair_call() requires the original data, and we don't want people to need the data just to get the object (many of these objects do not have a call at all).

I know it is an extra step; the underlying issue is the shortsightedness of the original author, whose code assumes that whatever was in the environment when the function was run is always still hanging around.
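To make the point about the environment concrete, here is a minimal base R sketch of the same mechanism. It uses lm() and mtcars rather than lda() and Smarket, and fit_inside() is a hypothetical stand-in for the way a framework fits a model on your behalf; it is not what parsnip does internally. The assumption is that plot() for an lda object rebuilds its data from the stored call, in the same way update() re-evaluates the call here.

```r
# Hypothetical stand-in for a framework that fits the model for you.
# The data arrives under a local name (`dat`) that exists only inside the function.
fit_inside <- function(dat) {
  lm(mpg ~ wt, data = dat)
}

fit <- fit_inside(mtcars)

# The stored call refers to `dat`, not to mtcars.
fit$call

# Anything that re-evaluates the call fails once `dat` is gone.
# tryCatch() captures the error message instead of stopping the script.
refit <- tryCatch(update(fit), error = function(e) conditionMessage(e))
refit

# Pointing the call at data that does exist makes the object usable again.
# This is the same idea as repair_call(), done by hand.
repaired <- fit
repaired$call$data <- quote(mtcars)
coef(update(repaired))
```

The fitted object itself is fine in both cases; only the functions that go back to the call for the data are affected, which is why the repair is a separate, optional step.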
