Is there an R package for SHAP visualization that is compatible with tidymodels? I have tried SHAPforxgboost, fastshap, and shapviz. Because my ML models are developed with the tidymodels grammar, I don't know how to use these packages with a tidymodels object. Is there any plan to develop a model explanation tool for tidymodels?
It depends on the model type. For general models, you can use (my) model-agnostic kernelshap package to calculate SHAP values and then plot them with (my) shapviz package:
library(tidymodels)
library(kernelshap)
library(shapviz)
iris_recipe <- iris %>%
  recipe(Sepal.Length ~ .)
reg <- linear_reg() %>%
  set_engine("lm")
iris_wf <- workflow() %>%
  add_recipe(iris_recipe) %>%
  add_model(reg)
fit <- iris_wf %>%
  fit(iris)
shap <- kernelshap(fit, iris[, -1], bg_X = iris) %>%
  shapviz()
sv_importance(shap, kind = "bee")
sv_dependence(shap, "Petal.Length")
If your model is fitted with the XGBoost or LightGBM backend, the "kernelshap" package is not required; "shapviz" alone will suffice. I'd need a reproducible model example from your side to provide a solution.
Thank you so much for your reply. Here is the code for xgboost and lightGBM modeling using tidymodels.
library(tidyverse)
library(bonsai)
library(tidymodels)
set.seed(123)
split <- initial_split(iris, prop = 0.7, strata = Species)
train <- training(split)
test <- testing(split)
cv <- vfold_cv(train, strata = Species, v = 10)
model_recipe <-
  recipe(Species ~ ., data = train)
# xgboost
xgboost_model <-
  boost_tree(
    mode = "classification",
    mtry = 5,
    trees = 1000,
    min_n = 4,
    tree_depth = 5,
    learn_rate = 0.05,
    sample_size = 0.7,
    engine = "xgboost"
  )
xgboost_wf <-
  workflow() %>%
  add_model(xgboost_model) %>%
  add_recipe(model_recipe) %>%
  fit(train)
# lightGBM
lgbm_model <-
  boost_tree(
    mode = "classification",
    mtry = 3,
    trees = 500,
    min_n = 15,
    tree_depth = 5,
    learn_rate = 0.03,
    loss_reduction = 0,
    engine = "lightgbm"
  )
lgbm_wf <-
  workflow() %>%
  add_model(lgbm_model) %>%
  add_recipe(model_recipe) %>%
  fit(train)
I don't know how to combine the shapviz package with the tidymodels object. Can you show the detailed code for visualizing the force plot, variable importance plot, and dependence plot based on the code above?
I'd also like to do the same for the multilayer perceptron model in tidymodels.
mlp_model <-
  mlp(
    mode = "classification",
    hidden_units = 8,
    penalty = 0.3,
    epochs = 500,
    engine = "nnet"
  )
mlp_wf <-
  workflow() %>%
  add_model(mlp_model) %>%
  add_recipe(model_recipe) %>%
  fit(train)
Unfortunately, I don't see a direct way to extract TreeSHAP values from XGBoost.
Edit: Here is an example of how to do it: https://lorentzen.ch/index.php/2023/01/27/shap-xgboost-tidymodels-love/
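For readers who want the gist without following the link, here is a minimal sketch of that kind of approach, not a definitive implementation. The idea is to pull the raw booster out of the workflow with extract_fit_engine(), bake the predictors with the workflow's own recipe, and hand both to shapviz(). The object names (xgb_fit, X_prep, sv_xgb) and the reduced number of trees are my own choices for illustration. I also assume that the installed shapviz version accepts which_class for multiclass XGBoost models, and that class 3 corresponds to "virginica" (the third factor level).

```r
library(tidymodels)
library(shapviz)

set.seed(123)
split <- initial_split(iris, prop = 0.7, strata = Species)
train <- training(split)

# Small XGBoost workflow (fewer trees than in the question, to keep it quick)
xgboost_wf <- workflow() %>%
  add_recipe(recipe(Species ~ ., data = train)) %>%
  add_model(
    boost_tree(
      mode = "classification",
      trees = 100,
      learn_rate = 0.05,
      engine = "xgboost"
    )
  ) %>%
  fit(train)

# Raw xgb.Booster that parsnip fitted under the hood
xgb_fit <- extract_fit_engine(xgboost_wf)

# Predictors after applying the workflow's recipe, as a numeric matrix
X_prep <- bake(
  extract_recipe(xgboost_wf),
  new_data = train,
  all_predictors(),
  composition = "matrix"
)

# TreeSHAP values for one class; X = train supplies the original
# feature values shown in the plots (assumption: which_class is supported)
sv_xgb <- shapviz(xgb_fit, X_pred = X_prep, X = train, which_class = 3)

sv_importance(sv_xgb, kind = "bee")
sv_dependence(sv_xgb, "Petal.Length")
sv_force(sv_xgb, row_id = 1)
```

shapviz also has a method for LightGBM boosters, so the same extract-then-bake pattern should in principle carry over to the lightgbm workflow, though I have not verified that here.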
For the MLP (and in fact any other model that produces numeric predictions), you can work with my new package "kernelshap". kernelshap() returns SHAP values for all three classes, but to plot them with "shapviz", we need to focus on one class.
library(tidyverse)
library(bonsai)
library(tidymodels)
set.seed(123)
split <- initial_split(iris, prop = 0.7, strata = Species)
train <- training(split)
test <- testing(split)
cv <- vfold_cv(train, strata = Species, v = 10)
model_recipe <-
  recipe(Species ~ ., data = train)
mlp_model <-
  mlp(
    mode = "classification",
    hidden_units = 5,
    penalty = 0.3,
    epochs = 50,
    engine = "nnet"
  )
mlp_wf <-
  workflow() %>%
  add_model(mlp_model) %>%
  add_recipe(model_recipe) %>%
  fit(train)
library(kernelshap)
library(shapviz)
library(withr)
# Draw 50 background rows reproducibly
background_data <- with_seed(
  1,
  train[sample(nrow(train), 50), ]
)
predict(mlp_wf, head(train, 1), type = "prob")
#
# .pred_setosa .pred_versicolor .pred_virginica
# <dbl> <dbl> <dbl>
# 1 0.564 0.221 0.215
# List with SHAP value matrices (one matrix per class)
shap_values <- kernelshap(mlp_wf, train, bg_X = background_data, type = "prob")
# Turn into shapviz -> select "virginica"
sv <- shapviz(shap_values, which_class = 3)
sv_importance(sv, kind = "bee")
sv_dependence(sv, "Sepal.Width", color_var = "auto")
sv_force(sv, row_id = 1)