Hi,
I am tuning a bagged tree model for a binary outcome:
bagtree_spec <- bag_tree(cost_complexity = tune(), tree_depth = tune(), min_n = tune()) %>%
  set_engine("rpart", times = 25) %>%
  set_mode("classification")
...
bagtree_tune <- bagtree_wf %>%
  tune_grid(resamples = folds,
            metrics = metric_set(sens, spec, roc_auc),
            control = control_grid(save_pred = FALSE),
            grid = 20)
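For reference, here is how I checked the default ranges that dials samples from when I pass grid = 20 (just printing the parameter objects; I'm assuming tune_grid() falls back to these defaults):

```r
library(dials)

# Print the default tuning ranges tune_grid() should sample from
cost_complexity()  # log10 scale; default range is roughly 1e-10 to 1e-1
tree_depth()
min_n()
```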
While tuning, I get error messages like these:
x Fold1: preprocessor 1/1, model 9/20: Error: Input must be a vector, not NULL.
x Fold2: preprocessor 1/1, model 9/20: Error: Input must be a vector, not NULL.
x Fold3: preprocessor 1/1, model 9/20: Error: Input must be a vector, not NULL.
x Fold3: preprocessor 1/1, model 16/20: Error: Input must be a vector, not NULL.
x Fold4: preprocessor 1/1, model 9/20: Error: Input must be a vector, not NULL.
x Fold5: preprocessor 1/1, model 9/20: Error: Input must be a vector, not NULL.
Remarkably, the tuning process does not error out entirely. I searched around and found that cost_complexity needs to be < 1, so I ran the code again with cost_complexity = 0 (fixed rather than tuned) and it finished with no messages. If both scenarios produce a result, why am I seeing this error message in one case and not the other?
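As a possible workaround, I'm considering passing an explicit grid with a bounded cost_complexity range instead of letting tune_grid() build one. A sketch (the range endpoints here are just my guess, not values I've validated):

```r
library(dials)

# Build a 20-point space-filling grid, keeping cost_complexity well below 1
bagtree_grid <- grid_latin_hypercube(
  cost_complexity(range = c(-5, -1)),  # log10 scale: 1e-5 to 1e-1
  tree_depth(),
  min_n(),
  size = 20
)
```

I would then pass grid = bagtree_grid to tune_grid() in place of grid = 20.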
UPDATE: I found that the model was tuned over only 19 parameter combinations even though I requested 20 (presumably the failing candidate, model 9/20, was dropped). I suspect tune() sometimes proposes values outside the valid range for the hyperparameter.
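Here is how I counted the candidates, in case I'm miscounting (assuming collect_metrics() reports one row per metric per surviving candidate):

```r
library(dplyr)

# Count the distinct hyperparameter combinations that produced metrics
bagtree_tune %>%
  collect_metrics() %>%
  distinct(cost_complexity, tree_depth, min_n) %>%
  nrow()
# gives 19 here, even though grid = 20 was requested
```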
Session info:
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server 2012 R2 x64 (build 9600)
Matrix products: default
Random number generation:
RNG: L'Ecuyer-CMRG
Normal: Inversion
Sample: Rejection
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] xgboost_1.3.2.1 caret_6.0-85 lattice_0.20-38 kernlab_0.9-29
[5] rpart_4.1-15 vctrs_0.3.6 rlang_0.4.10 themis_0.1.3
[9] baguette_0.1.0 glmnet_4.1-1 Matrix_1.2-17 NeuralNetTools_1.5.2
[13] keras_2.2.5.0 vip_0.3.2 doParallel_1.0.16 iterators_1.0.12
[17] foreach_1.4.8 yardstick_0.0.7 workflows_0.2.2 tune_0.1.3
[21] tidyr_1.1.3 tibble_3.1.0 rsample_0.0.9 recipes_0.1.15
[25] purrr_0.3.4 parsnip_0.1.5 modeldata_0.1.0 infer_0.5.4
[29] ggplot2_3.3.3 dials_0.0.9 scales_1.1.0 broom_0.7.5
[33] tidymodels_0.1.2 dplyr_1.0.5 skimr_2.1.3 readxl_1.3.1
loaded via a namespace (and not attached):
[1] Cubist_0.2.3 colorspace_1.4-1 ellipsis_0.3.0 class_7.3-15
[5] fs_1.3.2 base64enc_0.1-3 rstudioapi_0.13 farver_2.0.3
[9] listenv_0.8.0 furrr_0.2.2 ParamHelpers_1.14 earth_5.3.0
[13] prodlim_2019.11.13 fansi_0.4.1 mvtnorm_1.1-1 lubridate_1.7.10
[17] codetools_0.2-16 splines_3.6.1 libcoin_1.0-8 knitr_1.28
[21] zeallot_0.1.0 Formula_1.2-4 jsonlite_1.6 pROC_1.16.1
[25] tfruns_1.4 compiler_3.6.1 backports_1.1.5 assertthat_0.2.1
[29] cli_2.3.1 htmltools_0.4.0 tools_3.6.1 partykit_1.2-13
[33] gtable_0.3.0 glue_1.4.0 RANN_2.6.1 reshape2_1.4.4
[37] parallelMap_1.5.0 fastmatch_1.1-0 Rcpp_1.0.4.6 cellranger_1.1.0
[41] DiceDesign_1.9 nlme_3.1-140 timeDate_3043.102 inum_1.0-3
[45] mlr_2.19.0 gower_0.2.1 xfun_0.22 stringr_1.4.0
[49] globals_0.14.0 lifecycle_1.0.0 future_1.21.0 MASS_7.3-51.4
[53] ipred_0.9-9 BBmisc_1.11 C50_0.1.3.1 reticulate_1.15
[57] gridExtra_2.3 TeachingDemos_2.12 stringi_1.4.3 tensorflow_2.0.0
[61] plotrix_3.7-8 checkmate_2.0.0 butcher_0.1.3 lhs_1.1.1
[65] hardhat_0.1.5 lava_1.6.7 shape_1.4.5 repr_1.1.0
[69] pkgconfig_2.0.3 labeling_0.3 tidyselect_1.1.0 parallelly_1.24.0
[73] plyr_1.8.6 magrittr_2.0.1 R6_2.4.1 generics_0.1.0
[77] DBI_1.1.0 pillar_1.5.1 whisker_0.4 withr_2.4.2
[81] survival_2.44-1.1 nnet_7.3-12 ROSE_0.0-3 crayon_1.4.1
[85] unbalanced_2.0 utf8_1.1.4 usethis_1.6.1 grid_3.6.1
[89] data.table_1.12.8 FNN_1.1.3 ModelMetrics_1.2.2.1 plotmo_3.6.0
[93] digest_0.6.25 stats4_3.6.1 GPfit_1.0-8 munsell_0.5.0