Over the break I've been working on a package to do this called workflowsets. It can make different combinations of models and formulas (and other preprocessors, such as recipes).
It's beyond experimental, but the API might change slightly as people begin to use it:
# will require some devel tidymodels packages:
# remotes::install_github("tidymodels/workflowsets")
library(tidymodels)
#> ── Attaching packages ────────────────────────────────────── tidymodels 0.1.2 ──
#> ✓ broom 0.7.3 ✓ recipes 0.1.15.9000
#> ✓ dials 0.0.9.9000 ✓ rsample 0.0.8
#> ✓ dplyr 1.0.2 ✓ tibble 3.0.4
#> ✓ ggplot2 3.3.3 ✓ tidyr 1.1.2
#> ✓ infer 0.5.3 ✓ tune 0.1.2.9000
#> ✓ modeldata 0.1.0.9000 ✓ workflows 0.2.1
#> ✓ parsnip 0.1.4.9000 ✓ yardstick 0.0.7.9000
#> ✓ purrr 0.3.4
#> ── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
#> x purrr::discard() masks scales::discard()
#> x dplyr::filter() masks stats::filter()
#> x dplyr::lag() masks stats::lag()
#> x recipes::step() masks stats::step()
library(workflowsets)
# Define a list of preprocessors, such as formulas or recipes
formulas <-
  list(
    mod_1 = Species ~ Sepal.Length + Sepal.Width,
    mod_2 = Species ~ Sepal.Length + Petal.Width,
    mod_3 = Species ~ Sepal.Length + Sepal.Width + Petal.Length
  )
# Define a model to use
model_spec <- multinom_reg() %>% set_engine("nnet", trace = 0)
# Combine a list of models and list of preprocessors
iris_set <- workflow_set(formulas, models = list(glm = model_spec))
iris_set
#> # A workflow set/tibble: 3 x 6
#> wflow_id preproc model object option result
#> <chr> <chr> <chr> <list> <list> <list>
#> 1 mod_1_glm formula multinom_reg <workflow> <list [0]> <list [0]>
#> 2 mod_2_glm formula multinom_reg <workflow> <list [0]> <list [0]>
#> 3 mod_3_glm formula multinom_reg <workflow> <list [0]> <list [0]>
# Evaluate them using the bootstrap:
set.seed(1)
bt <- bootstraps(iris, times = 50)
iris_results <-
  iris_set %>%
  workflow_map("fit_resamples", resamples = bt, seed = 2)
#>
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#>
#> %@%, as_function, flatten, flatten_chr, flatten_dbl, flatten_int,
#> flatten_lgl, flatten_raw, invoke, list_along, modify, prepend,
#> splice
#>
#> Attaching package: 'vctrs'
#> The following object is masked from 'package:tibble':
#>
#> data_frame
#> The following object is masked from 'package:dplyr':
#>
#> data_frame
# Show the results using any of these functions
collect_metrics(iris_results) %>%
  arrange(.metric)
#> # A tibble: 6 x 9
#> wflow_id .config preproc model .metric .estimator mean n std_err
#> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <int> <dbl>
#> 1 mod_1_glm Preprocessor… formula multin… accura… multiclass 0.784 50 6.23e-3
#> 2 mod_2_glm Preprocessor… formula multin… accura… multiclass 0.956 50 3.45e-3
#> 3 mod_3_glm Preprocessor… formula multin… accura… multiclass 0.954 50 3.68e-3
#> 4 mod_1_glm Preprocessor… formula multin… roc_auc hand_till 0.922 50 3.14e-3
#> 5 mod_2_glm Preprocessor… formula multin… roc_auc hand_till 0.993 50 7.90e-4
#> 6 mod_3_glm Preprocessor… formula multin… roc_auc hand_till 0.995 50 8.43e-4
rank_results(iris_results, rank_metric = "accuracy")
#> # A tibble: 6 x 9
#> wflow_id .config .metric mean std_err n model preprocessor rank
#> <chr> <chr> <chr> <dbl> <dbl> <int> <chr> <chr> <int>
#> 1 mod_2_glm Preprocessor… accura… 0.956 3.45e-3 50 multin… formula 1
#> 2 mod_2_glm Preprocessor… roc_auc 0.993 7.90e-4 50 multin… formula 1
#> 3 mod_3_glm Preprocessor… accura… 0.954 3.68e-3 50 multin… formula 2
#> 4 mod_3_glm Preprocessor… roc_auc 0.995 8.43e-4 50 multin… formula 2
#> 5 mod_1_glm Preprocessor… accura… 0.784 6.23e-3 50 multin… formula 3
#> 6 mod_1_glm Preprocessor… roc_auc 0.922 3.14e-3 50 multin… formula 3
Created on 2021-01-07 by the reprex package (v0.3.0)
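If it helps to see what "different combinations" means, here is a rough base R sketch of the crossing idea. It is not how workflowsets is implemented; it only mimics how each preprocessor name is paired with each model name to produce IDs like the `wflow_id` values above. The `tree` label is a made-up second model added purely for illustration.

```r
# Names of the preprocessors (the formulas used in the reprex above)
preproc_names <- c("mod_1", "mod_2", "mod_3")

# Names of the models; "tree" is hypothetical and only here to show crossing
model_names <- c("glm", "tree")

# Cross every preprocessor with every model
combos <- expand.grid(
  preproc = preproc_names,
  model = model_names,
  stringsAsFactors = FALSE
)

# Build an ID per combination, mirroring the <preproc>_<model> naming pattern
combos$wflow_id <- paste(combos$preproc, combos$model, sep = "_")

combos
```

With three preprocessors and two models this gives six rows, one per candidate workflow.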
We're looking for feedback, so please file issues or suggestions on the GitHub repo (tidymodels/workflowsets).