Does the type of dummy variable coding matter for regression trees?

I sometimes use contr.sum factor coding to make the main effect term more intuitive in linear regression models with interaction terms.
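To make the two codings concrete, here is a minimal sketch in plain Python of the contrast matrices I have in mind. It is meant to mirror what R's `contr.treatment` and `contr.sum` return for a 3-level factor. The function names and the level labels are my own placeholders, not part of any library.

```python
def treatment_contrasts(n_levels):
    """Dummy (treatment) coding: the first level is the reference, coded all zeros."""
    return [[1 if level == col + 1 else 0 for col in range(n_levels - 1)]
            for level in range(n_levels)]


def sum_contrasts(n_levels):
    """Sum-to-zero coding: the last level is coded -1 in every column."""
    rows = [[1 if level == col else 0 for col in range(n_levels - 1)]
            for level in range(n_levels - 1)]
    rows.append([-1] * (n_levels - 1))
    return rows


levels = ["a", "b", "c"]  # hypothetical factor levels, for illustration only
for name, row_t, row_s in zip(levels, treatment_contrasts(3), sum_contrasts(3)):
    print(name, "treatment:", row_t, "sum:", row_s)
```

Under both schemes each level maps to its own distinct row, so the level membership information is the same. Only the numeric values the model sees differ.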

Is there any benefit to using contr.sum coding in regression tree (xgboost) models? Does it have any effect on how variable importance or SHAP values are interpreted?

https://stats.idre.ucla.edu/r/library/r-library-contrast-coding-systems-for-categorical-variables/
