Hi Folks,
I figured it out with help from a few posts. I'm especially grateful for Rebecca Barter's detailed tutorial, which covers variable importance for a different model class, random forests (see ref 2 below).
I'm outlining the main steps here, but please review the links at the end for details on why it was done this way.
1. Get Your Final Model
set.seed(2020)
# Assuming a linear SVM fitted with the kernlab engine
# Grid search over the tuning parameters
tune_rs <- tune_grid(
  model_wf,
  train_folds,
  grid = param_grid,
  metrics = classification_measure,
  control = control_grid(save_pred = TRUE)
)
# Finalise the workflow with the parameters that give the best accuracy
# (naming the metric argument works across tune versions)
best_accuracy <- select_best(tune_rs, metric = "accuracy")
svm_wf_final <- finalize_workflow(
  model_wf,
  best_accuracy
)
# Fit the final workflow on all available data at the end of the experiment
final_model <- fit(svm_wf_final, data)
# fit() on a workflow applies the preprocessor (formula or recipe) and then
# runs the parsnip model fit routine on the data you pass in
2. Extract the KSVM Object, Pull Required Info, Calculate Variable Importance
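ksvm_obj <- pull_workflow_fit(final_model)$fit
# pull_workflow_fit() returns the parsnip model fit object
# (newer versions of workflows provide extract_fit_parsnip() for the same purpose)
# $fit returns the object produced by the underlying fitting function, which is what we need
# Its class depends on the engine; here it is a kernlab ksvm object
coefs <- ksvm_obj@coef[[1]]
# The first piece of information we need: the support vector coefficients from the linear fit
mat <- ksvm_obj@xmatrix[[1]]
# The matrix we multiply the coefficients against: one row per support vector, one column per predictor
var_impt <- coefs %*% mat
# Variable importance: one weight per predictor; a larger absolute value means a more influential predictor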
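To make the last step concrete, here is a minimal base R sketch of the same matrix multiplication on made-up numbers. The values of coefs and mat below are hypothetical toy inputs, not output from a real ksvm fit, and the predictor names x1 and x2 are placeholders. It only illustrates that coefs %*% mat yields one weight per predictor, and that ranking by absolute value is one reasonable way to read those weights as importance.

```r
# Hypothetical toy values standing in for ksvm_obj@coef[[1]] and
# ksvm_obj@xmatrix[[1]]; they do not come from a real model fit.
coefs <- c(0.5, -0.25, 1.0)            # one coefficient per support vector
mat <- matrix(
  c(1,  2,
    3,  4,
    5, -6),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("x1", "x2")) # one column per predictor
)

# A length-3 vector times a 3 x 2 matrix gives a 1 x 2 matrix of weights
w <- coefs %*% mat
# x1: 0.5*1 - 0.25*3 + 1*5    =  4.75
# x2: 0.5*2 - 0.25*4 + 1*(-6) = -6

# Rank predictors by the absolute size of their weight
var_rank <- sort(abs(w[1, ]), decreasing = TRUE)
print(var_rank)
# x2 = 6.00, x1 = 4.75
```

The sign of each weight tells you which class the predictor pushes towards, so keep w itself around if direction matters and use the absolute values only for ranking.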
Ref:
- Extracting the Weights of Support Vectors using Caret: https://stackoverflow.com/questions/56515373/linear-svm-and-extracting-the-weights?noredirect=1&lq=1
- Variable Importance (Last Section of this post): http://www.rebeccabarter.com/blog/2020-03-25_machine_learning/#finalize-the-workflow