The workflow object contains the preprocessing object (e.g., a recipe), which stores all of the information needed to encode, format, or preprocess new data. Calling predict() on the fitted workflow applies that stored preprocessing to the new data automatically.
In some cases, the model function itself does some of this preprocessing. When that is the case, the fitted model object contains the training set statistics.
For example:
library(tidymodels)
tidymodels_prefer()
theme_set(theme_bw())
rec <-
  recipe(mpg ~ ., data = mtcars) %>%
  step_normalize(all_numeric_predictors(), id = "norm")
model_fit <-
  workflow() %>%
  add_recipe(rec) %>%
  add_model(linear_reg()) %>%
  fit(data = mtcars)
# Get the "fitted" recipe:
model_fit %>%
  extract_recipe()
#> Recipe
#>
#> Inputs:
#>
#>       role #variables
#>    outcome          1
#>  predictor         10
#>
#> Training data contained 32 data points and no missing data.
#>
#> Operations:
#>
#> Centering and scaling for cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb [trained]
# Get the training set means and sds
model_fit %>%
  extract_recipe() %>%
  tidy(id = "norm")
#> # A tibble: 20 × 4
#>    terms statistic   value id
#>    <chr> <chr>       <dbl> <chr>
#>  1 cyl   mean        6.19  norm
#>  2 disp  mean      231.    norm
#>  3 hp    mean      147.    norm
#>  4 drat  mean        3.60  norm
#>  5 wt    mean        3.22  norm
#>  6 qsec  mean       17.8   norm
#>  7 vs    mean        0.438 norm
#>  8 am    mean        0.406 norm
#>  9 gear  mean        3.69  norm
#> 10 carb  mean        2.81  norm
#> 11 cyl   sd          1.79  norm
#> 12 disp  sd        124.    norm
#> 13 hp    sd         68.6   norm
#> 14 drat  sd          0.535 norm
#> 15 wt    sd          0.978 norm
#> 16 qsec  sd          1.79  norm
#> 17 vs    sd          0.504 norm
#> 18 am    sd          0.499 norm
#> 19 gear  sd          0.738 norm
#> 20 carb  sd          1.62  norm
Created on 2022-03-21 by the reprex package (v2.0.1)
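To check that these stored statistics are what gets applied to new data, the sketch below refits the same workflow and compares two paths: predicting through the workflow, and baking the new data by hand with the trained recipe before predicting with the underlying parsnip fit. The `new_cars` data frame is hypothetical (three rows of `mtcars` with the outcome dropped, standing in for unseen data), and the comparison assumes a recent tidymodels installation.

```r
library(tidymodels)

# Same recipe and workflow as in the example above
rec <-
  recipe(mpg ~ ., data = mtcars) %>%
  step_normalize(all_numeric_predictors(), id = "norm")

model_fit <-
  workflow() %>%
  add_recipe(rec) %>%
  add_model(linear_reg()) %>%
  fit(data = mtcars)

# Hypothetical "new" data: three rows of mtcars without the outcome column
new_cars <- mtcars[1:3, names(mtcars) != "mpg"]

# Path 1: the workflow applies the stored preprocessing for us
pred_workflow <- predict(model_fit, new_data = new_cars)

# Path 2: apply the trained recipe by hand, then predict with the parsnip fit
trained_rec <- extract_recipe(model_fit)
baked <- bake(trained_rec, new_data = new_cars)
pred_manual <- predict(extract_fit_parsnip(model_fit), new_data = baked)

# Both paths should give the same predictions
all.equal(pred_workflow$.pred, pred_manual$.pred)

# The baked values are (x - training mean) / training sd, using the
# statistics reported by tidy(); checked here for a single column
norm_stats <- tidy(trained_rec, id = "norm")
wt_mean <- norm_stats$value[norm_stats$terms == "wt" & norm_stats$statistic == "mean"]
wt_sd <- norm_stats$value[norm_stats$terms == "wt" & norm_stats$statistic == "sd"]
manual_wt <- (new_cars$wt - wt_mean) / wt_sd

all.equal(baked$wt, manual_wt)
```

In practice only the first path is needed; the manual steps are there to show where the training set means and standard deviations enter.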