Using logistic regression with elastic net penalty in R

I chose the best alpha and lambda values for three different analyses (so I have three different models). My predictors are ordinal, and my outcome is binary. I have selected the best models (two being ridge) and trained them on the training data. I know that I need to use the models to predict on the test data, but I'm really confused about how to do this with binarized data. See the latter part of my code below:

#########################################################################
#For analysis one, select best alpha and lambda values and create model
#########################################################################

#Split predictor variables and DV and convert to as.matrix

train_predictors_1 <- as.matrix(train.data1[ ,-(1)])
test_predictors1 <- as.matrix(test.data1[ ,-(1)])
train_dv1 <- as.matrix(train.data1$state)
test_dv1 <- as.matrix(test.data1$state)

#Develop chosen model - chose binomial family due to binary outcome variable

A1_en <- glmnet(train_predictors_1, train_dv1, family = "binomial", alpha = 0.1, lambda = .05) #Elnet

#########################################################################
#For analysis two, select best alpha and lambda values and create model
#########################################################################

#Split predictor variables and DV and convert to as.matrix

train_predictors_2 <- as.matrix(train.data2[ ,-(1)])
test_predictors2 <- as.matrix(test.data2[ ,-(1)])
train_dv2 <- as.matrix(train.data2$state)
test_dv2 <- as.matrix(test.data2$state)

#Develop chosen model - chose binomial family due to binary outcome variable

A2_en <- glmnet(train_predictors_2, train_dv2, family = "binomial", alpha = 0, lambda = .2) #Ridge

#########################################################################
#For analysis three, select best alpha and lambda values and create model
#########################################################################

#Split predictor variables and DV and convert to as.matrix

train_predictors_3 <- as.matrix(train.data3[ ,-(1)])
test_predictors_3 <- as.matrix(test.data3[ ,-(1)])
train_dv3 <- as.matrix(train.data3$state)
test_dv3 <- as.matrix(test.data3$state)

#Develop chosen model - chose binomial family due to binary outcome variable

A3_en <- glmnet(train_predictors_3, train_dv3, family = "binomial", alpha = 0, lambda = .15) #Ridge

##############################
#Look at coefficients
##############################

#Model 1

enet.coef_1 <- coef(A1_en, s = 0.05) #coef() selects lambda via s; alpha is fixed when the model is fit

#Model 2

enet.coef_2 <- coef(A2_en, s = 0.2) #coef() selects lambda via s; alpha is fixed when the model is fit

#Model 3

enet.coef_3 <- coef(A3_en, s = 0.15) #coef() selects lambda via s; alpha is fixed when the model is fit

##############################################################

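In general, for typical R models you call predict(), passing the model object first and then the data to predict from via the newdata = argument. Note that for glmnet models the equivalent argument is named newx, and it expects a matrix with the same columns as the training predictors.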
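
For a glmnet fit specifically, the predict step might be sketched as follows. This is only an illustration on simulated stand-in data: the names x_train, y_train, x_test, y_test, and fit are made up for the example, and only the alpha and lambda values are borrowed from analysis one in the question.

```r
library(glmnet)

set.seed(1)
# Simulated stand-in data (an assumption, not the poster's data):
# 5 ordinal predictors coded 1-5 and a binary outcome
n <- 200
x <- matrix(sample(1:5, n * 5, replace = TRUE), ncol = 5)
colnames(x) <- paste0("item", 1:5)
y <- rbinom(n, 1, plogis(-1.5 + 0.5 * x[, 1]))

# Simple train/test split: first 150 rows for training, the rest held out
train_idx <- 1:150
x_train <- x[train_idx, ]
y_train <- y[train_idx]
x_test <- x[-train_idx, ]
y_test <- y[-train_idx]

fit <- glmnet(x_train, y_train, family = "binomial", alpha = 0.1, lambda = 0.05)

# predict() on a glmnet object takes the new predictor matrix as newx
pred_prob <- predict(fit, newx = x_test, s = 0.05, type = "response")  # P(outcome = 1)
pred_class <- predict(fit, newx = x_test, s = 0.05, type = "class")    # labels as character

# Confusion matrix and accuracy on the held-out rows
conf_mat <- table(predicted = pred_class[, 1], observed = y_test)
accuracy <- mean(pred_class[, 1] == as.character(y_test))
print(conf_mat)
print(accuracy)
```

With type = "response" you get predicted probabilities, which you can threshold yourself; type = "class" returns the predicted label directly. In the question's code, the same calls would take A1_en and test_predictors1 in place of fit and x_test.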

Hi @nirgrahamuk thank you for your response. If possible, could you guide me on a couple of questions:

  • why have I seen some people look at coefficients before using the model on test data? Should I look at coefficients after using the model to predict test data?

  • how do you get p-values for the coefficients?

Try the following to get p-values: after you make a model that has a name (suppose you call it my_model),
run the summary function, i.e. summary(my_model), which for models fit with lm() or glm() gives the coefficients and their p-values. Be aware that this does not carry over to glmnet objects: summary() on a glmnet fit does not report p-values, because penalized fits do not come with standard errors by default. Also, most people look at their coefficients before using the model to make sure that at least some of the variables are actually useful (that's the *, **, *** next to the p-values). You'll want to make sure you have at least one low p-value before you use a model; otherwise you may not be predicting anything with accuracy.
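
To illustrate the summary() route on a model type that does report p-values, here is a minimal sketch using base R's glm(). Note that this is an unpenalized logistic regression, not one of the glmnet models from the question, and the data frame dat and its columns item1 and item2 are simulated stand-ins made up for the example.

```r
set.seed(1)
# Simulated stand-in data (hypothetical names): two ordinal predictors, binary outcome
dat <- data.frame(item1 = sample(1:5, 200, replace = TRUE),
                  item2 = sample(1:5, 200, replace = TRUE))
dat$state <- rbinom(200, 1, plogis(-1.5 + 0.5 * dat$item1))

# Ordinary (unpenalized) logistic regression
my_model <- glm(state ~ item1 + item2, data = dat, family = binomial)

# Coefficient table: estimate, standard error, z value, and p-value
coef_table <- summary(my_model)$coefficients
print(coef_table)

# Just the p-value column
p_values <- coef_table[, "Pr(>|z|)"]
print(p_values)
```

Printing summary(my_model) itself shows the same table with the significance stars alongside.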
