# Interpreting variable importance for multinomial logistic regression - `nnet::multinom()` and `caret::varImp()`

Hi everyone! This is a question that combines questions about {caret}, {nnet}, multinomial logistic regression, and how to interpret the results of the functions of those packages.

I am trying to calculate and interpret the variable importance of a multinomial logistic regression I built using the `multinom()` function from the {nnet} R package. I want to measure the variable importance of each predictor variable contributing to the outcome variable, and the documentation of {caret} says that its function `varImp()` can do that. On the surface the code works in terms of generating some importance values, but what it doesn't do (I think - in the documentation or the function itself) is tell me how these values are calculated or what they actually are.

Here's my attempt at a reprex:

``````
library(tidyverse)
library(nnet)
library(caret)

fit <- multinom(Species ~ ., data = iris) # fit model

varImp(fit)
``````

And this is what I get:

``````
              Overall
Sepal.Length 13.38206
Sepal.Width  24.07817
Petal.Length 37.90455
Petal.Width  18.23298
``````

My question is: what do these numbers mean, and how can I find out how they are calculated? (I've checked the package documentation.) Is there an alternative way to get an estimate of the relative variable importance?

Thank you!

Now to your question. The importance that `varImp()` reports here is, for each variable, the sum of the absolute values of its coefficients across the outcome classes.

``````
# library(tidyverse)
library(nnet)
library(caret)
#> Loading required package: lattice
#> Loading required package: ggplot2

fit <- multinom(Species ~ ., data = iris) # fit model
#> # weights:  18 (10 variable)
#> initial  value 164.791843
#> iter  10 value 16.177348
#> iter  20 value 7.111438
#> iter  30 value 6.182999
#> iter  40 value 5.984028
#> iter  50 value 5.961278
#> iter  60 value 5.954900
#> iter  70 value 5.951851
#> iter  80 value 5.950343
#> iter  90 value 5.949904
#> iter 100 value 5.949867
#> final  value 5.949867
#> stopped after 100 iterations

fit
#> Call:
#> multinom(formula = Species ~ ., data = iris)
#>
#> Coefficients:
#>            (Intercept) Sepal.Length Sepal.Width Petal.Length Petal.Width
#> versicolor    18.69037    -5.458424   -8.707401     14.24477   -3.097684
#> virginica    -23.83628    -7.923634  -15.370769     23.65978   15.135301
#>
#> Residual Deviance: 11.89973
#> AIC: 31.89973
``````

For instance, the importance of `Sepal.Length` (`13.38206`) is the sum of `abs(-5.458424)` and `abs(-7.923634)`.
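If you want to check this yourself, here is a minimal sketch that recomputes the values by hand and compares them with what `varImp()` returns. It assumes `coef(fit)` is a matrix with one row per non-reference class and one column per term, which is what the printed output above shows. The names `cf`, `manual` and `vi` are just illustrative, and `trace = FALSE` is only there to silence the iteration log.

``````r
library(nnet)
library(caret)

# Same model as above; trace = FALSE only suppresses the iteration log
fit <- multinom(Species ~ ., data = iris, trace = FALSE)

# Coefficient matrix: rows = non-reference classes, columns = model terms
cf <- coef(fit)

# Sum the absolute coefficients per term, then drop the intercept
manual <- colSums(abs(cf))
manual <- manual[names(manual) != "(Intercept)"]
manual

# Compare with caret's values (names stripped so only the numbers are compared)
vi <- varImp(fit)
all.equal(unname(manual), unname(vi$Overall))
``````

If the description above is right, the comparison should come back `TRUE`. Keep in mind that a sum of absolute coefficients depends on the scale of each predictor, so the values are easiest to compare when the predictors are measured in similar units.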

Hope this helps.

The method used in `caret` (and `vip`, IIRC) is based on a paper by Gevrey et al. (2003) for neural networks that uses weighted averages of the model coefficients.

@Rodrigue's answer is spot-on

Thanks @Rodrigue ! Good point noted on the cross-posting, I've now added the link to the original Stack Overflow post

Thanks for answering my question, that's really helpful!!


