Is step_lincomb() a real multicollinearity filter, involving outcomes and predictors?

Is step_lincomb() a real multicollinearity filter, involving outcomes and predictors? I mean in the sense of reliably avoiding the Dummy Variable Trap...

https://www.learndatasci.com/...

'What is the Dummy Variable Trap? The Dummy Variable Trap occurs when two or more dummy variables created by one-hot encoding are highly correlated (multi-collinear). This means that one variable can be predicted from the others, making it difficult to interpret predicted coefficient variables in regression models.'

...Furthermore, I mean the case where a predictor x is too highly correlated with the outcome y. But that would mean I would have to include the predictor in step_lincomb() to analyze the correlation between outcome and predictors, and maybe lose it, or?

In other words... is step_lincomb() the convenient successor of info <- car::vif(myModel)?

Toy can just use step_dummy() to get dummy variables without the collinearity. You have to opt-in to getting a one-hot encoding.
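A minimal sketch of that difference, assuming a made-up data frame `toy` with one three-level factor (the data and column names are hypothetical, only the recipes calls are real): by default step_dummy() creates one column fewer than there are levels, and one_hot = TRUE is the opt-in that creates a column for every level.

```r
library(recipes)

# Hypothetical toy data: a numeric outcome and one three-level factor
toy <- data.frame(
  y = c(1.2, 3.4, 2.2, 5.1, 4.8, 3.9),
  color = factor(c("red", "green", "blue", "red", "green", "blue"))
)

# Default behaviour: reference-cell coding, so 3 levels give 2 dummy columns
ref_coded <- recipe(y ~ color, data = toy) |>
  step_dummy(color) |>
  prep() |>
  bake(new_data = NULL)

# Opt-in one-hot encoding: 3 levels give 3 dummy columns
one_hot <- recipe(y ~ color, data = toy) |>
  step_dummy(color, one_hot = TRUE) |>
  prep() |>
  bake(new_data = NULL)

names(ref_coded)
names(one_hot)
```

With the default coding, the dropped level is absorbed by the model intercept, which is why the dummy columns are not linearly dependent on it.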

step_lincomb() will look for relationships between whatever you give it, so maybe use step_lincomb(all_numeric_predictors()) and it will be fine.
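As a rough sketch of that suggestion, assuming simulated data in which `x3` is built as an exact sum of `x1` and `x2` (all names and values here are hypothetical): because the selector only picks up predictors, the outcome `y` is never a candidate for removal.

```r
library(recipes)

# Hypothetical simulated data
set.seed(1)
dat <- data.frame(
  y  = rnorm(20),
  x1 = rnorm(20),
  x2 = rnorm(20)
)
dat$x3 <- dat$x1 + dat$x2  # exact linear combination of x1 and x2

# Only numeric predictors are passed to the filter; y is left alone
rec <- recipe(y ~ ., data = dat) |>
  step_lincomb(all_numeric_predictors()) |>
  prep()

baked <- bake(rec, new_data = NULL)

# One of the three linearly dependent predictors should be gone
names(baked)
```

Calling tidy(rec, number = 1) on the prepped recipe should list which column was removed.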

OK, I don't know who Toy is .-), but what then is the difference between step_lincomb() and step_corr()? Is it only the threshold option?
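One way to explore this is a small experiment. The sketch below assumes simulated data (all column names are hypothetical) and relies on the documented behaviour that step_corr() filters on pairwise absolute correlation above `threshold`, while step_lincomb() looks for exact linear dependencies among the selected columns. It builds one exactly dependent column (`x_sum`) and one nearly duplicated column (`x_near`) and checks which step reacts to which.

```r
library(recipes)

# Hypothetical simulated data
set.seed(42)
n <- 200
dat <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n))
dat$x_sum  <- dat$x1 + dat$x2               # exact linear combination
dat$x_near <- dat$x1 + rnorm(n, sd = 0.01)  # near-duplicate of x1, not exact

# Helper: names of the columns that survive a recipe
kept <- function(rec) names(bake(prep(rec), new_data = NULL))

# Case 1: exact dependency, but only moderate pairwise correlations
exact_dat <- dat[, c("y", "x1", "x2", "x_sum")]
lincomb_exact <- kept(recipe(y ~ ., data = exact_dat) |>
  step_lincomb(all_numeric_predictors()))
corr_exact <- kept(recipe(y ~ ., data = exact_dat) |>
  step_corr(all_numeric_predictors(), threshold = 0.9))

# Case 2: very high pairwise correlation, but no exact dependency
near_dat <- dat[, c("y", "x1", "x2", "x_near")]
lincomb_near <- kept(recipe(y ~ ., data = near_dat) |>
  step_lincomb(all_numeric_predictors()))
corr_near <- kept(recipe(y ~ ., data = near_dat) |>
  step_corr(all_numeric_predictors(), threshold = 0.9))

lincomb_exact; corr_exact
lincomb_near;  corr_near
```

In this setup step_lincomb() should drop a column only in the first case and step_corr() only in the second, which suggests the two steps differ in what they detect and not just in having a threshold.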
