This is somewhat inspired by what people commonly do on Kaggle and by the LightGBMTunerCV function in optuna (in Python). For something like xgboost, a pretty sensible approach is to start with reasonable default hyperparameters and then tune them one at a time (assessed by cross-validation, probably using a very high maximum number of trees plus early stopping), starting with the ones whose best value depends least on the values of the other hyperparameters. E.g. an order like this: first the depth of the trees, then the proportion of records you subsample per tree, then the minimum child weight, and then the proportion of predictor columns you subsample. You might even do this with a high learning rate and then, once the other hyperparameter values are settled, re-do the CV assessment to find a good number of trees before your final refit. That gets you to relatively decent hyperparameters fairly quickly.
Now, my question is: how would I automate this with tune? The first sweep over the first hyperparameter is easy to set up, but is there a built-in function (or one that I should use to define a generic strategy) that then easily lets me select the "best" value from that first sweep and fix the hyperparameter at that value before I sweep over the next hyperparameter?
For example, let's say I tried tree depths from 2 to 32 (while keeping all other hyperparameters at my chosen defaults) and 12 seems like a really good value. Now I want to change the default for the tree depth to 12 and try different values for the proportion of records I subsample for each tree (e.g. between 0.2 and 1.0).
In other words, this is really an iterative sequence of one-dimensional grid searches, each followed by some basic if-then-else logic for setting up the next grid search, but ideally this would all fit into the tune framework. Does a solution for this already exist? I did not spot a function that seems to be intended for this, but perhaps I overlooked it.
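To make the logic I have in mind concrete, here is a minimal sketch in plain Python (not tune, and not optuna's actual implementation). Everything in it is hypothetical: `sequential_sweep` is a helper name I made up, `toy_cv_loss` is a made-up stand-in for a cross-validated metric (lower is better), and the parameter names, defaults, and candidate grids are only for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Params = Dict[str, float]


def sequential_sweep(
    score_fn: Callable[[Params], float],
    defaults: Params,
    sweeps: Sequence[Tuple[str, Sequence[float]]],
) -> Tuple[Params, List[Tuple[str, float, float]]]:
    """Tune one hyperparameter at a time, in the given order.

    For each (name, candidates) pair, every candidate is scored with all
    other hyperparameters held at their current values; the best candidate
    then becomes the new current value before the next sweep starts.
    """
    current = dict(defaults)  # copy so the caller's defaults are untouched
    history = []
    for name, candidates in sweeps:
        scored = [(score_fn({**current, name: value}), value) for value in candidates]
        best_score, best_value = min(scored, key=lambda pair: pair[0])
        current[name] = best_value
        history.append((name, best_value, best_score))
    return current, history


def toy_cv_loss(p: Params) -> float:
    # Hypothetical stand-in for a CV metric; a real version would fit the
    # model with early stopping and return the resampled loss.
    return (
        (p["tree_depth"] - 12) ** 2 / 100
        + (p["sample_size"] - 0.8) ** 2
        + (p["min_n"] - 5) ** 2 / 50
        + (p["mtry_prop"] - 0.6) ** 2
    )


defaults = {"tree_depth": 6, "sample_size": 1.0, "min_n": 1, "mtry_prop": 1.0}
sweeps = [
    ("tree_depth", [2, 4, 8, 12, 16, 32]),
    ("sample_size", [0.2, 0.4, 0.6, 0.8, 1.0]),
    ("min_n", [1, 5, 10, 20]),
    ("mtry_prop", [0.2, 0.4, 0.6, 0.8, 1.0]),
]

best, history = sequential_sweep(toy_cv_loss, defaults, sweeps)
for name, value, score in history:
    print(f"{name}: picked {value} (loss {score:.4f})")
```

What I am looking for is essentially the tune equivalent of the loop inside `sequential_sweep`: run a one-dimensional grid, pick the best value, fix it, and move on to the next hyperparameter.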