I am using DVC to organise and run a pipeline to train multiple models based on a yaml-based configuration. These models go through all stages in the pipeline, including using the same preprocessing steps.
Now, I want to set which features each model uses in the configuration.
A low-friction solution is to use step_select() and just keep the features selected in the configuration. The downside with this approach is that the recipe will go through every step in production, even if some models will only use a subset of those features, which is not very efficient.
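A minimal sketch of the step_select() approach, for illustration only. The `mtcars` data, the normalisation step and the `model_features` vector (standing in for the feature list read from the YAML configuration) are placeholders, and it assumes a recipes version in which step_select() is available.

```r
library(recipes)

# Hypothetical feature list; in the real pipeline this would come from the
# YAML configuration (for example via yaml::read_yaml()).
model_features <- c("disp", "hp", "wt")

# Shared preprocessing: every model gets the same steps.
rec <- recipe(mpg ~ ., data = mtcars)
rec <- step_normalize(rec, all_numeric_predictors())

# Keep only the outcome and the configured features. `!!` injects the
# vector's value into the step so the selection does not depend on the
# variable still existing when the recipe is prepped.
rec <- step_select(rec, all_outcomes(), all_of(!!model_features))

prepped <- prep(rec, training = mtcars)
baked <- bake(prepped, new_data = NULL)
names(baked)
```

Note that step_normalize() is still estimated and applied to every numeric predictor here, including the ones that are dropped afterwards, which is exactly the inefficiency described above.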
So my question is, is there an expected/idiomatic way to disable steps after the recipe is created but before it's used in a workflow for tuning/training?
My current guess is to set skip = TRUE by default when creating the recipe, and then somehow switch it to skip = FALSE for the steps each model needs.
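A rough sketch of what that guess could look like, assuming the steps are given explicit ids and that editing the `steps` list of an unprepped recipe directly is acceptable (this relies on the recipe's internal structure rather than a dedicated helper). The step choices and the `needed_ids` vector are hypothetical.

```r
library(recipes)

# Create every step with skip = TRUE and a stable, explicit id.
rec <- recipe(mpg ~ ., data = mtcars)
rec <- step_log(rec, disp, skip = TRUE, id = "log_disp")
rec <- step_sqrt(rec, hp, skip = TRUE, id = "sqrt_hp")

# Hypothetical: ids of the steps this particular model needs,
# e.g. derived from the YAML configuration.
needed_ids <- c("log_disp")

# Flip skip back to FALSE for the needed steps before prep().
for (i in seq_along(rec$steps)) {
  if (rec$steps[[i]]$id %in% needed_ids) {
    rec$steps[[i]]$skip <- FALSE
  }
}

# Inspect the resulting flags, one per step in recipe order.
skip_flags <- vapply(rec$steps, function(s) s$skip, logical(1))
skip_flags
```

One caveat with this idea: as documented, skip only controls whether a step is applied by bake() on new data. Steps with skip = TRUE are still estimated and applied to the training set during prep(), so this alone may not remove the cost of the unused steps.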