bagEarth prefers dataframes over tibbles


#1

I have been using a MARS model with the earth package, and I was feeling pretty good about it until I read that the importance ranking of the predictor variables can be somewhat arbitrary. Not to worry, though: the variance in variable importance can be averaged out using caret::bagEarth. But I am getting an error.

This works great:

earth(flux_x_pro_mat, flux_y, nfold = 4, ncross = 10, keepxy = TRUE)

This is not working:

bagEarth(flux_x_pro_mat, flux_y, B = 50)

Error in terms.formula(formula, data = data) :
duplicated name 'total_N2O' in data frame using '.'

"total_N2O" is the column name of flux_y, a one-column dataframe.

Any chance this is a bug, or should I just try harder?


#2

I think that it should work, but I would try passing flux_y as a vector.
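
For anyone following along, here is a minimal sketch of pulling the response out of a one-column data frame as a plain vector. The values are made up and stand in for the real flux_y; only the column name total_N2O comes from the post above. The bagEarth call is left commented out because it needs caret and the original predictors.

```r
# Hypothetical stand-in for the one-column response (values are invented)
flux_y <- data.frame(total_N2O = c(0.12, 0.40, 0.31, 0.95))

# `[[` extracts the column itself, so the result is an atomic vector
flux_y_vec <- flux_y[["total_N2O"]]

is.numeric(flux_y_vec)    # a numeric vector
is.null(dim(flux_y_vec))  # no dimensions, so not a data frame

# Assumed usage with the objects from the question:
# bagEarth(flux_x_pro_mat, flux_y_vec, B = 50)
```

Using `[[` (or `$`) avoids relying on how single-bracket indexing behaves for a particular data frame class.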


#3

Thanks, Max, that seems so obvious now. I had already done it for the predictors. Passing a vector didn't exactly work (it was actually a tibble problem), but this did: flux_y <- as.data.frame(unclass(flux_y))
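
A small sketch of what that conversion does, in case it helps someone else. To keep it runnable without the tibble package, the tibble is imitated here by attaching the tibble class vector to a plain list; that construction and the values are assumptions for illustration, not the real data.

```r
# Imitate a one-column tibble in base R (assumption: the real flux_y
# carried the classes "tbl_df", "tbl", "data.frame")
flux_y <- structure(
  list(total_N2O = c(0.12, 0.40, 0.31, 0.95)),
  class = c("tbl_df", "tbl", "data.frame"),
  row.names = c(NA_integer_, -4L)
)
class(flux_y)

# unclass() strips the class attribute, leaving a named list;
# as.data.frame() then rebuilds a plain data frame from that list
flux_y_df <- as.data.frame(unclass(flux_y))
class(flux_y_df)  # "data.frame"
names(flux_y_df)  # "total_N2O"

# Assumed usage with the predictors from the question:
# bagEarth(flux_x_pro_mat, flux_y_df, B = 50)
```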