I have been changing the way I model to the tidymodels way. One difficulty I am having is that I have not seen much documentation on how to use this framework for hyperparameter tuning.
One tutorial I like is Max Kuhn's Optimization Methods for Tuning Predictive Models, but this tutorial uses caret.
Also, the dials documentation covers building grids of parameters, but it does not go into detail on non-linear optimization the way the above tutorial does.
I have some starter code for Nelder-Mead optimization below, but it only hits one set of parameters.
How can I use the tidymodels framework and do optimization?
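For reference, the grid-building approach from the dials documentation looks roughly like the sketch below. It assumes a recent dials release in which `cost()` and `rbf_sigma()` are parameter functions; the object name `svm_grid` and the choice of 3 levels are only illustrative. It enumerates fixed candidates rather than searching iteratively, which is the gap described above.

```r
library(dials)

# Regular grid over the two SVM parameters: 3 levels each, so 3 x 3 candidates.
# The ranges are the dials defaults, not values tuned for this data set.
svm_grid <- grid_regular(cost(), rbf_sigma(), levels = 3)

nrow(svm_grid)   # number of candidate parameter sets
names(svm_grid)  # one column per parameter
```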
# load libs and data
library(caret)
library(tidymodels)
data(Sacramento)
str(Sacramento)
#> 'data.frame': 932 obs. of 9 variables:
#> $ city : Factor w/ 37 levels "ANTELOPE","AUBURN",..: 34 34 34 34 34 34 34 34 29 31 ...
#> $ zip : Factor w/ 68 levels "z95603","z95608",..: 64 52 44 44 53 65 66 49 24 25 ...
#> $ beds : int 2 3 2 2 2 3 3 3 2 3 ...
#> $ baths : num 1 1 1 1 1 1 2 1 2 2 ...
#> $ sqft : int 836 1167 796 852 797 1122 1104 1177 941 1146 ...
#> $ type : Factor w/ 3 levels "Condo","Multi_Family",..: 3 3 3 3 3 1 3 3 1 3 ...
#> $ price : int 59222 68212 68880 69307 81900 89921 90895 91002 94905 98937 ...
#> $ latitude : num 38.6 38.5 38.6 38.6 38.5 ...
#> $ longitude: num -121 -121 -121 -121 -121 ...
# make objective function
svm_obj <- function(param){
  ## Create split of data (fixed seed so every call sees the same folds)
  set.seed(955)
  ctrl <- rsample::vfold_cv(Sacramento, v = 10, repeats = 1, strata = NULL)
  # define SVM model object; param holds log10(cost) and log10(rbf_sigma)
  get_svm_model <- function(data){
    svm_rbf(mode = "regression", cost = 10^(param[1]), rbf_sigma = 10^(param[2])) %>%
      set_engine("kernlab") %>%
      fit(price ~ ., data = data)
  }
  # predict on the assessment set and keep the observed price alongside
  get_preds <- function(asses_dat, model_obj, ...) {
    price <- asses_dat %>% select(price)
    pred <- predict(model_obj, new_data = asses_dat %>% select(-price))
    cbind(price, pred)
  }
  # fit on each analysis set, predict on the matching assessment set
  asses_model <- ctrl %>%
    mutate(asses = map(.$splits, assessment)) %>%
    mutate(anal = map(.$splits, analysis)) %>%
    mutate(svm_mod = map(anal, get_svm_model)) %>%
    mutate(svm_pred = map2(asses, svm_mod, get_preds))
  # mean MAE across the folds is the value to minimize
  map_df(asses_model$svm_pred, yardstick::mae, price, .pred) %>%
    summarise(mean(.estimate)) %>%
    pull()
}
## Nelder-Mead
set.seed(45642)
num_mods <- 3
nm_res <- optim(par = c(0, 0), fn = svm_obj, method = "Nelder-Mead",
control = list(maxit = num_mods))
nm_res$par
#> [1] 0 0
nm_res$value
#> [1] 83742.79
Created on 2018-12-09 by the reprex package (v0.2.1)
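One way to check which parameter sets `optim()` actually evaluates is to log every call to the objective. The sketch below is only an illustration under stated assumptions: `make_logged_obj()` is a hypothetical helper, and the objective is a cheap stand-in quadratic with its minimum at (1, -2) rather than the cross-validated SVM error, so it runs instantly. It also leaves `maxit` at its default instead of 3, since such a small budget leaves Nelder-Mead little room beyond setting up its initial simplex around the starting point.

```r
# Hypothetical helper: wraps an objective so each evaluated point is recorded.
make_logged_obj <- function() {
  visited <- list()
  list(
    fn = function(param) {
      # record the point, then return the stand-in objective value
      visited[[length(visited) + 1]] <<- param
      (param[1] - 1)^2 + (param[2] + 2)^2
    },
    # return all recorded points as a matrix, one row per evaluation
    visited = function() do.call(rbind, visited)
  )
}

logged <- make_logged_obj()

# Same call shape as in the question, but with the default iteration budget
nm_toy <- optim(par = c(0, 0), fn = logged$fn, method = "Nelder-Mead")

visited_pts <- logged$visited()
colnames(visited_pts) <- c("log10_cost", "log10_sigma")

nm_toy$par         # best point found
nrow(visited_pts)  # how many parameter sets were tried
head(visited_pts)  # the first few points, starting with c(0, 0)
```

Swapping `logged$fn` for a logged version of `svm_obj` would show the same trace for the real problem, at the cost of one 10-fold fit per row.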