Hi,
I am currently looking at CART trees in relation to variable importance. The caret documentation describes a function called varImp(), which calculates variable importance differently depending on the model. For rpart it also says: "This method does not currently provide class-specific measures of importance when the response is a factor."
When I fit an rpart model, as in the code below, I am able to call varImp() on it. Can anyone tell me how the importance is calculated? Is it based on the drop in the Gini index when the variable is permuted or dropped?
Thanks for your time
library(rpart)
library(caret)
#> Warning: package 'caret' was built under R version 3.5.1
#> Loading required package: lattice
#> Loading required package: ggplot2
#> Warning: package 'ggplot2' was built under R version 3.5.1
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.5.1
#> Warning: package 'dplyr' was built under R version 3.5.1
# Get the Data
data(GermanCredit)
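# Fit a single classification tree (CART) with rpart
tree_mod <- rpart(Class ~ ., data = GermanCredit)
# Show the ten variables with the highest importance scores
caret::varImp(tree_mod) %>%
  rownames_to_column() %>%
  arrange(desc(Overall)) %>%
  slice(1:10)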
#>                             rowname  Overall
#> 1                            Amount 57.82419
#> 2                          Duration 47.32593
#> 3        CheckingAccountStatus.none 43.66521
#> 4        CheckingAccountStatus.lt.0 37.87057
#> 5            CreditHistory.Critical 24.11095
#> 6                    Purpose.NewCar 20.31030
#> 7                   Purpose.UsedCar 18.56253
#> 8      CheckingAccountStatus.gt.200 17.39552
#> 9  OtherDebtorsGuarantors.Guarantor 11.40171
#> 10                 Property.Unknown 11.25420
Created on 2018-11-21 by the reprex package (v0.2.1)
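One way to probe the question is to compare the varImp() scores with the split improvements that rpart stores on the fitted object. As I read the caret documentation, for recursive partitioning the reduction in the loss function attributed to each variable at each split is tabulated and summed, with the top competing variables at each split counted as well. The sketch below is only a check under stated assumptions, not a statement of caret's internals: it assumes that setting maxcompete = 0 and maxsurrogate = 0 leaves only primary splits in the splits matrix, and the names ctrl, primary_only, manual_imp and comparison are made up for illustration.

```r
library(rpart)
library(caret)

data(GermanCredit)

# Assumption: with competitor and surrogate splits switched off, the
# splits matrix of the fitted tree holds one row per primary split.
ctrl <- rpart.control(maxcompete = 0, maxsurrogate = 0)
primary_only <- rpart(Class ~ ., data = GermanCredit, control = ctrl)

# Sum the "improve" column of the split table for each variable.
# For a classification tree this column reflects the decrease in the
# impurity measure (Gini by default) achieved by the split.
improve <- unname(primary_only$splits[, "improve"])
split_vars <- rownames(primary_only$splits)
manual_imp <- sort(vapply(split(improve, split_vars), sum, numeric(1)),
                   decreasing = TRUE)

# Importance as reported by caret for the same tree.
caret_imp <- caret::varImp(primary_only)

# Put both sets of scores side by side for the variables used in splits.
comparison <- data.frame(
  manual = as.numeric(manual_imp),
  caret = caret_imp[names(manual_imp), "Overall"],
  row.names = names(manual_imp)
)
print(comparison)
```

If the two columns agree, that would suggest the score is a sum of split improvements rather than anything based on permuting or dropping a variable. Any gap between these numbers and the output of the default fit above would then presumably come from the competing splits.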