One comprehensive model with categorical variables & interactions vs. models on subsets of the data

Consider a data set with

  • one categorical variable with 4 levels
  • and 10 continuous numerical variables.

Assume the models must be linear regressions of some sort (OLS, Elastic Net, Partial Least Squares).

My clients often ask me to create a separate model for each level of the categorical variable because they believe the behavior of the numerical variables is completely different within each level.

I explain that it's better to create one model with interaction terms to address this.

They counter that the models with interaction terms are difficult to interpret and too complex. It's a fair point.
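One way to make the comparison concrete: for plain OLS, a single model with a separate intercept and a separate set of slopes for every level (full interactions, written as one block of columns per level) reproduces the coefficients of the per-level models exactly. The sketch below checks this on synthetic data. The data, the sizes, and the helper name `fit_ols` are assumptions made up for illustration, not anything from a real client data set. The exact match is specific to unpenalized OLS; Elastic Net or PLS fits on the pooled data need not coincide with per-level fits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes mirror the question: 4 levels, 10 continuous predictors.
# n_per_level is an arbitrary choice for this illustration.
n_levels, n_features, n_per_level = 4, 10, 200

# Synthetic data (assumption): every level has its own intercept and slopes.
true_intercepts = rng.normal(size=n_levels)
true_slopes = rng.normal(size=(n_levels, n_features))

level = np.repeat(np.arange(n_levels), n_per_level)
X = rng.normal(size=(level.size, n_features))
y = (true_intercepts[level]
     + np.einsum("ij,ij->i", X, true_slopes[level])
     + rng.normal(scale=0.5, size=level.size))


def fit_ols(design, target):
    """Return least-squares coefficients for the given design matrix."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


# Approach 1: one OLS model per level (intercept + 10 slopes each).
separate = np.vstack([
    fit_ols(np.column_stack([np.ones((level == g).sum()), X[level == g]]),
            y[level == g])
    for g in range(n_levels)
])

# Approach 2: one model on all rows with full interactions, laid out as one
# block per level: [indicator_g, indicator_g * x1, ..., indicator_g * x10].
indicators = (level[:, None] == np.arange(n_levels)).astype(float)
blocks = [np.column_stack([indicators[:, g], indicators[:, [g]] * X])
          for g in range(n_levels)]
pooled = fit_ols(np.hstack(blocks), y).reshape(n_levels, n_features + 1)

# Row g of both arrays is [intercept_g, slope_g1, ..., slope_g10].
max_abs_diff = np.abs(separate - pooled).max()
print(max_abs_diff)
```

With this block-per-level layout the single model can be read exactly like four separate models, which may ease the interpretability objection. The difference from fitting on subsets is that the single model pools one residual variance across levels, so standard errors and tests differ, and it allows some slopes to be constrained as shared across levels when the data support that.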

How would you handle this situation? Are there scenarios where it's better to create models on the subsets?

Great question. I'm new to modeling and I am just starting to play with interactions and I learned a lot from your question! That said, how about using multiple models to help explain the model with interactions so that next time a model with interactions will be more readily understood? The interaction terms are a bit confusing.
