Hey all -
I've been trying to extend parsnip with the rotationForest implementation on GitHub by mananshah99 (https://github.com/mananshah99/rotationforest). (There's another implementation on CRAN, I think, but it has some odd implementation details.) This is mostly an exercise to get a feel for how extending parsnip works, and I almost have it working: the fit and predict methods work just fine, but as soon as I pass the model into tune_grid() I get the following error:
Error: Problem with `mutate()` input `object`.
x Error when calling rotationForest(): Error in BuildModel(xdf, ydf, npredictor, ...) :
argument "xdf" is missing, with no default
ℹ Input `object` is `purrr::map(call_info, eval_call_info)`.
Digging under the hood with debug() and :::, the failure seems to come from the call to parameters(), which works fine on a model with no tune() parameters:
Collection of 0 parameters for tuning
[1] id parameter type object class
<0 rows> (or 0-length row.names)
But it errors when the model does have tune() parameters. I think my problem is that I need to wrap the fit function for rotationForest so that it returns values parsnip and tune can work with, but even after reading the vignettes I can't get my head around how to do that. I'm sure I'm only about three metres off the track, so any advice would be much appreciated! Setup code below:
library(tidymodels)
library(rotationForest)
# Setup -------------------------------------------------------------------
set_model_mode(model = "rand_forest", mode = "classification")
set_model_engine(
  "rand_forest",
  mode = "classification",
  eng = "rotationForest"
)
set_dependency("rand_forest", eng = "rotationForest", pkg = "rotationForest")
set_model_arg(
  model = "rand_forest",
  eng = "rotationForest",
  parsnip = "mtry",
  original = "npredictor",
  func = list(pkg = "rotationForest", fun = "rotationForest"),
  has_submodel = FALSE
)
set_model_arg(
  model = "rand_forest",
  eng = "rotationForest",
  parsnip = "trees",
  original = "ntree",
  func = list(pkg = "rotationForest", fun = "rotationForest"),
  has_submodel = FALSE
)
set_fit(
  model = "rand_forest",
  eng = "rotationForest",
  mode = "classification",
  value = list(
    interface = "data.frame",
    protect = c("xdf", "ydf"),
    func = c(pkg = "rotationForest", fun = "rotationForest"),
    defaults = list()
  )
)
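In case it helps, this is the sort of thin wrapper I've been considering pointing set_fit at instead of rotationForest directly. This is an untested sketch; the argument names xdf, ydf, npredictor, and ntree come from the GitHub implementation, and the wrapper name is my own:

# Untested sketch: a thin wrapper around rotationForest() that pins down
# the data-frame interface, in case parsnip needs a stable signature.
# Argument names (xdf, ydf, npredictor, ntree) follow the GitHub source.
rotation_forest_wrapper <- function(xdf, ydf, npredictor = 1, ntree = 100, ...) {
  rotationForest::rotationForest(
    xdf = xdf,
    ydf = ydf,
    npredictor = npredictor,
    ntree = ntree,
    ...
  )
}

I'd then register it via func = c(fun = "rotation_forest_wrapper") in set_fit, though I haven't confirmed this changes anything about the parameters() failure.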
class_info <- list(
  pre = NULL,
  post = NULL,
  func = c(fun = "predict"),
  args = list(
    rotationForestObject = quote(object$fit),
    dependent = quote(new_data)
  )
)
set_pred(
  model = "rand_forest",
  eng = "rotationForest",
  mode = "classification",
  type = "class",
  value = class_info
)
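For completeness: I also wondered whether the class predictions need a post hook to coerce them into the factor format parsnip expects. I sketched a variant like the one below but haven't wired it in, because I'm not sure what the GitHub predict method actually returns; the coercion step is a placeholder guess:

# Untested variant of class_info with a post hook. The coercion is a guess:
# I don't know the exact return type of the GitHub predict method.
class_info_post <- list(
  pre = NULL,
  post = function(results, object) {
    # coerce whatever predict() returns into a factor with the training levels
    factor(as.character(results), levels = object$lvl)
  },
  func = c(fun = "predict"),
  args = list(
    rotationForestObject = quote(object$fit),
    dependent = quote(new_data)
  )
)

(object$lvl here is the outcome-levels field parsnip stores on classification fits.)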
set_encoding(
  model = "rand_forest",
  mode = "classification",
  eng = "rotationForest",
  options = list(
    predictor_indicators = "none",
    compute_intercept = FALSE,
    remove_intercept = FALSE
  )
)
# Testing -----------------------------------------------------------------
data("two_class_dat", package = "modeldata")
set.seed(4622)
example_split <- initial_split(two_class_dat, prop = 0.7)
example_train <- training(example_split)
example_test <- testing(example_split)
bs_train <- bootstraps(example_train)
### this works just fine:
model <- rand_forest(trees = 100, mtry = 1, mode = "classification") %>%
  set_engine(engine = "rotationForest")
rot_for_fit <- model %>%
  fit(Class ~ ., data = example_train)
predict(rot_for_fit, new_data = example_train)
## this call works just fine as well:
parameters(model)
## but when we try something more complicated
recipe_rot <- recipe(Class ~ ., data = example_train) %>%
  step_normalize(all_predictors())
model_grid <- rand_forest(trees = tune(), mtry = tune(), mode = "classification") %>%
  set_engine(engine = "rotationForest")
wf <- workflow() %>%
  add_recipe(recipe_rot) %>%
  add_model(model_grid)
grid_rot <- grid_random(
  finalize(mtry(), x = dplyr::select(example_train, -Class)),  # finalize mtry on predictors only
  trees(),
  size = 3
)
## it fails here:
tune_grid(wf, resamples = bs_train, grid = grid_rot)
## this call also fails:
parameters(model_grid)