Dear Community,

Could you help me retrieve the subsets of explanatory variables returned by the “ols_step_best_subset” function from the olsrr package?

I would like to build several linear and nonlinear models, each based on the subset of variables selected by a particular criterion (AIC, BIC, Mallows’ Cp, etc.). To do this I’m using the “ols_step_best_subset” function (Variable Selection Methods):

An example for the Mallows’ Cp criterion:

model=lm(y~., data=AItraining)

library(olsrr)

SUBSETS<-ols_step_best_subset(model)

CP<-SUBSETS$cp

PRED<-SUBSETS$predictors

CP_MATRIX<-as.matrix(CP)

PRED_MATRIX<-as.matrix(PRED)

CP_VAR<-data.frame(CP_MATRIX,PRED_MATRIX)

library(dplyr)

V_CP<-filter(CP_VAR, CP_MATRIX == min(CP_MATRIX))[2]

The result shows that, in this case, the selected subset of explanatory variables consists of two variables (x1 and x12):

V_CP

PRED_MATRIX

1 x1 x12

I’d like to build a model from these two variables, but I don’t know how to do this automatically.

Say y~x1+x12

At this point I’m doing it manually:

data_train_CP<-as.data.frame(cbind(AItraining$y,AItraining$x1,AItraining$x12))

names(data_train_CP)=c("y","x1","x12")

model_CP=lm(y~., data=data_train_CP)

summary(model_CP)

variables_CP_test<-as.data.frame(cbind(AItesting$x1,AItesting$x12))

names(variables_CP_test)=c("x1","x12")

variables_CP_test

### FORECASTS TRAINING DATASET

predict(model_CP)

### FORECASTS TESTING DATASET

predict(model_CP, newdata=AItesting)

The code works, but only when the variables are selected manually. I don’t know how to write code that builds this model with variables x1 and x12 “automatically” from “PRED_MATRIX”. Could you help me with this?
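
One possible direction, as a minimal sketch rather than a definitive solution: since the selected subset comes back as a single string such as "x1 x12", it could be split on whitespace and turned into a formula with base R's reformulate(). The data frames and the Cp values below are synthetic stand-ins (assumptions made so the snippet runs on its own); in the real code the two columns would come from SUBSETS$cp and SUBSETS$predictors.

```r
# Synthetic stand-ins for AItraining / AItesting (assumption: a response y
# and predictors x1..x12, as in the post).
set.seed(1)
make_data <- function(n) {
  d <- as.data.frame(matrix(rnorm(n * 12), ncol = 12))
  names(d) <- paste0("x", 1:12)
  d$y <- 2 * d$x1 - 3 * d$x12 + rnorm(n, sd = 0.1)
  d
}
AItraining <- make_data(100)
AItesting  <- make_data(20)

# Hypothetical stand-in for the two columns taken from the best-subset result;
# the Cp values here are made up for illustration only.
CP_VAR <- data.frame(
  cp         = c(35.2, 1.8, 3.1),
  predictors = c("x1", "x1 x12", "x1 x5 x12"),
  stringsAsFactors = FALSE
)

# Pick the predictor string with the smallest Cp and split it into names.
best <- as.character(CP_VAR$predictors[which.min(CP_VAR$cp)])
vars <- strsplit(best, "\\s+")[[1]]

# Build the formula y ~ x1 + x12 from the character vector and fit the model.
form_CP  <- reformulate(vars, response = "y")
model_CP <- lm(form_CP, data = AItraining)
summary(model_CP)

# Forecasts on the training and testing data sets.
pred_train <- predict(model_CP)
pred_test  <- predict(model_CP, newdata = AItesting)
```

Because the formula refers to columns by name, the full AItraining and AItesting data frames can be passed directly, so the manual cbind() and names() steps would no longer be needed. The same pattern should carry over to other criteria by replacing which.min(CP_VAR$cp) with the corresponding column and rule.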