MARS model sensitivity to multicollinearity

rafanadal1986 · December 5, 2020, 12:49am

Hello guys,
Is the MARS model sensitive to collinearity or multicollinearity between variables?
If so, is there a proper way to detect multicollinearity, as is the case with VIF in linear models?

Dave_Armstrong · December 5, 2020, 1:26pm

If you think about how the MARS model works, it does the following:

1. Create and include pairs of hinge functions on the forward pass.
2. Prune hinge functions individually on the backward pass.

There shouldn't be any potential for collinearity within a pair of hinge functions on the forward pass because of the way that they're made: one side will be zero whenever the other is non-zero. There could be collinearity across hinge functions (i.e., between one side of a hinge for one variable and one side of a hinge for a different variable), but I would think one of those would get pruned out on the backward pass because its deletion from the model wouldn't result in a meaningful loss of fit (particularly if the correlations were really high). If you chose not to prune the hinges (I'm not sure why you would), then there are possibilities for collinearity, I suppose.

In a sense, the model might be sensitive to collinearity in that, given the ordering of the variables in the backward pass, it might choose a different hinge to prune, but the predictions it makes should not be sensitive to collinearity. If you wanted to see what collinearity looked like for the model, you could make the design matrix and use conventional collinearity diagnostics on it.

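library(earth)
# Fit a MARS model to the built-in trees data
mod <- earth(Volume ~ ., data = trees)
# Build the design matrix of selected hinge terms, dropping the intercept column
mf <- model.frame(formula(mod), data = trees)
X <- model.matrix(mod, mf)[, -1]
# R-squared from regressing each hinge term on all of the others
r2j <- sapply(seq_len(ncol(X)), function(i) summary(lm(X[, i] ~ X[, -i]))$r.squared)
tol <- 1 - r2j   # tolerance
vif <- 1 / tol   # variance inflation factor
names(vif) <- colnames(X)
vif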
# h(Girth-14.2) h(14.2-Girth)  h(Height-75) 
#      1.683813      1.475042      1.333748
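
The point about paired hinges can be checked directly. The following is a minimal sketch in base R, not part of the original reply: the helper `h()` is a hypothetical name for the hinge function, and the knot value 14.2 is simply copied from the output above. It shows that the two sides of a pair are never non-zero at the same time, even though they are still correlated with each other.

```r
# Hypothetical helper: the hinge function h(z) = max(z, 0)
h <- function(z) pmax(z, 0)

girth <- trees$Girth   # trees ships with base R (datasets package)
knot  <- 14.2          # knot copied from the earth output above

right <- h(girth - knot)   # non-zero only where Girth > knot
left  <- h(knot - girth)   # non-zero only where Girth < knot

# The two sides never overlap, so their elementwise product is always zero
all(right * left == 0)

# Non-overlapping does not mean uncorrelated: the correlation is negative,
# which is consistent with VIFs somewhat above 1 for these terms
cor(right, left)
```

Non-overlap rules out perfect collinearity within a pair, but it does not make the two columns orthogonal once an intercept is in the model.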

These are just my impressions based on how the model works. I'm happy to hear other thoughts, too.

rafanadal1986 · December 7, 2020, 5:23am

Thank you for the detailed response, Dave.
I have a question: does pruning the hinges happen after performing k-fold cross-validation?

Dave_Armstrong · December 7, 2020, 1:16pm

Cross-validation is one of the ways that the model can perform the pruning: it removes individual hinges whose removal does not increase the cross-validated RSE by more than some pre-specified amount. So the pruning happens during (i.e., as the main goal of) the cross-validation. You could also use cross-validation to tune the other pieces of the model (e.g., nk and degree), but you would have to set that up manually, and that decision would be independent of whether you use cross-validation to prune the hinges.
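
As a rough sketch of what this could look like in earth (my reading of the package documentation, not something stated in the reply): the default backward pass scores terms with generalized cross-validation (GCV), while `pmethod = "cv"` together with `nfold` asks earth to pick the number of terms by k-fold cross-validation. The choice of 5 folds and the seed are arbitrary assumptions for illustration.

```r
library(earth)

set.seed(1)  # fold assignment is random, so fix the seed for repeatability

# Default: backward pruning scored by GCV
mod_gcv <- earth(Volume ~ ., data = trees)

# Cross-validated pruning: pmethod = "cv" requires nfold
# (5 folds is an arbitrary choice for this small data set)
mod_cv <- earth(Volume ~ ., data = trees, pmethod = "cv", nfold = 5)

# Terms kept under each pruning approach (first column of bx is the intercept)
colnames(mod_gcv$bx)
colnames(mod_cv$bx)
```

The two approaches may or may not keep the same terms on a given data set. Tuning nk or degree would need an outer loop around calls like these.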

rafanadal1986 · December 10, 2020, 9:31pm

Thanks, Dave.
I am new to this MARS algorithm.
What is the manual method or formula to prune hinges?
What is interesting to me is, how can one avoid pruning the hinges?

Max · December 11, 2020, 2:01pm

This document is a great resource to learn more about the details of the algorithm and the model itself.

I don't know that you can totally stop all pruning, but using pmethod = "none" is the best way to get close to that.
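
A small sketch of what `pmethod = "none"` does, reusing the trees example from earlier in the thread. The object names `pruned` and `unpruned` are made up for illustration. As far as I understand it, the forward pass still has its own stopping rules (such as nk and thresh), so this keeps every term the forward pass produced rather than every possible hinge.

```r
library(earth)

pruned   <- earth(Volume ~ ., data = trees)                    # default backward pruning
unpruned <- earth(Volume ~ ., data = trees, pmethod = "none")  # keep all forward-pass terms

# Number of terms retained in each model (intercept included)
length(pruned$selected.terms)
length(unpruned$selected.terms)

# Conventional VIFs on the unpruned basis matrix, intercept column dropped
X <- unpruned$bx[, -1, drop = FALSE]
vif <- sapply(seq_len(ncol(X)), function(i) {
  1 / (1 - summary(lm(X[, i] ~ X[, -i]))$r.squared)
})
names(vif) <- colnames(X)
vif
```

Comparing these VIFs with those from the pruned model gives a feel for how much collinearity the backward pass removes on a given data set.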
