Interpreting variable importance for multinomial logistic regression - `nnet::multinom()` and `caret::varImp()`

Hi everyone! This question touches on {caret}, {nnet}, multinomial logistic regression, and how to interpret the output of functions from those packages.

I am trying to calculate and interpret the variable importance of a multinomial logistic regression I built using the multinom() function from the {nnet} R package. I want to measure the variable importance of each predictor variable contributing to the outcome variable, and the documentation of {caret} says that its function varImp() can do that. On the surface the code works in terms of generating some importance values, but what it doesn't do (I think - in the documentation or the function itself) is tell me how these values are calculated or what they actually are.

Here's my attempt at a reprex:

library(tidyverse)
library(nnet)
library(caret)

fit <- multinom(Species ~ ., data = iris) # fit model

varImp(fit)

And this is what I get:

              Overall
Sepal.Length 13.38206
Sepal.Width  24.07817
Petal.Length 37.90455
Petal.Width  18.23298

My question is - what do these numbers mean, or how can I find out what they mean? (I've tried the package documentation) Is there an alternative way where I can get an estimate of the relative variable importance?

Thank you!

(Sorry - I've posted this question once on Stack Overflow but didn't get an answer...)

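Hi @martinctc, first of all, welcome to the community.

You say that you posted the same issue on Stack Overflow. Can you link it here? See the FAQ entry about cross-posting.

Now to your question. For a multinom() model, the varImp() value for a variable is the sum of the absolute values of that variable's coefficients, one coefficient per non-reference outcome class.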

# library(tidyverse)
library(nnet)
library(caret)
#> Loading required package: lattice
#> Loading required package: ggplot2

fit <- multinom(Species ~ ., data = iris) # fit model
#> # weights:  18 (10 variable)
#> initial  value 164.791843 
#> iter  10 value 16.177348
#> iter  20 value 7.111438
#> iter  30 value 6.182999
#> iter  40 value 5.984028
#> iter  50 value 5.961278
#> iter  60 value 5.954900
#> iter  70 value 5.951851
#> iter  80 value 5.950343
#> iter  90 value 5.949904
#> iter 100 value 5.949867
#> final  value 5.949867 
#> stopped after 100 iterations

fit
#> Call:
#> multinom(formula = Species ~ ., data = iris)
#> 
#> Coefficients:
#>            (Intercept) Sepal.Length Sepal.Width Petal.Length Petal.Width
#> versicolor    18.69037    -5.458424   -8.707401     14.24477   -3.097684
#> virginica    -23.83628    -7.923634  -15.370769     23.65978   15.135301
#> 
#> Residual Deviance: 11.89973 
#> AIC: 31.89973

Created on 2020-02-27 by the reprex package (v0.2.1)

For instance:
For Sepal.Length, the importance (13.38206) is the sum of abs(-5.458424) and abs(-7.923634), i.e. 5.458424 + 7.923634 = 13.382058.
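If you want to check this for every predictor at once, here is a minimal sketch that rebuilds the numbers directly from the fitted coefficients. It assumes an outcome with three or more classes, so that coef() returns a matrix with one row per non-reference class; the object names `co` and `imp` are just illustrative.

```r
library(nnet)

# Same model as in the reprex; trace = FALSE only silences the iteration log
fit <- multinom(Species ~ ., data = iris, trace = FALSE)

# Coefficient matrix: rows are non-reference classes, columns are terms
co <- coef(fit)

# Drop the intercept, then sum absolute coefficients per predictor
imp <- colSums(abs(co[, colnames(co) != "(Intercept)", drop = FALSE]))
imp

# Optional: express each predictor as a share of the total
imp / sum(imp)
```

The values in `imp` should line up with the `Overall` column from varImp(fit). Keep in mind that these are raw coefficient magnitudes, so they depend on the units each predictor is measured in.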

Hope this helps.

The method used in caret (and vip, IIRC) is based on a paper by Gevrey et al. (2003) for neural networks that uses weighted averages of the model coefficients.

@Rodrigue's answer is spot-on

Thanks @Rodrigue ! Good point noted on the cross-posting, I've now added the link to the original Stack Overflow post; I did let 23 days elapse before coming here - had no intention of spamming at all! Will watch out for this in the future.

Thanks for answering my question, that's really helpful!!

Hi @martinctc

If it's OK with you, could you mark the answer as the solution? That way, people with a similar question can quickly see what worked for you. :heavy_check_mark: :slight_smile:
