Consider a data set
- one categorical variable with 4 levels
- and 10 continuous numerical variables.
Require that the models be linear regressions of some sort (OLS, Elastic Net, Partial Least Squares).
My clients often ask me to create a separate model for each level of the categorical variable because they believe the behavior of the numerical variables is completely different within each level.
I explain that it's better to create one model with interaction terms to address this.
They counter that the models with interaction terms are difficult to interpret and too complex. It's a fair point.
How would you handle this situation? Are there scenarios where it's better to create models on the subsets.