 # How to keep significant variable results within a lapply formula for univariable analysis

Dear Braintrust,
I'm facing an interesting (at least for me) challenge, especially because I'm not used to wide datasets.
My dataset (my_data) has 50 potential covariates (X's that can be either numeric or factors).

I would first like to run a univariable analysis using a Poisson model (my dependent variable Y is a count), then automatically select the variables that go on to the next step, multivariable modelling.

I've created a vector of the X's of interest (u_var) to pass through the univariable analysis:

``````
lapply(u_var,  # u_var is the vector of X names
       function(var) {
         # Poisson model with an offset
         formula <- as.formula(paste("Y ~", var, "+ offset(n)"))
         res.pois.uni <- glm(formula, data = my_data, family = poisson(link = "log"))
         summary(res.pois.uni)
       })
``````

I then obtain my 50 univariable models.
However, I would like to keep only those with a p-value below a specific threshold (0.15 to 0.2 is commonly used in my field of research), and then carry the remaining variables forward to multivariable modelling.
Is there a simple way to specify this within my lapply command?

I would then like to keep only the variables of interest, which I could put through a backward elimination strategy.

Something along the following lines

``````
bogie <- 0.15
make_bogie <- function() summary(glm(cyl ~ mpg, data = mtcars, family = poisson(link = "log")))$coefficients[2,4] <= bogie

make_bogie()
#> [1] TRUE
``````

is all I can offer without a `reprex`. See the FAQ: How to do a minimal reproducible example `reprex` for beginners.
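
To stretch that idea over a whole vector of names, here is a rough sketch rather than a tested solution for your data. Everything in it is a stand-in: `mtcars` plays the role of `my_data`, `carb` plays the count outcome `Y`, and the short `u_var` is made up. Because a factor with more than two levels has several rows in the coefficient table, the sketch takes one p-value per variable from the likelihood-ratio test in `anova()` instead of reading `coefficients[2,4]`. That is a swapped-in choice on my part, not something from your post.

```r
# Hypothetical stand-ins for my_data, Y and u_var (assumptions, not the poster's data)
my_data <- transform(mtcars, am = factor(am), gear = factor(gear))
u_var   <- c("mpg", "wt", "qsec", "am", "gear")
bogie   <- 0.15  # screening threshold

# One univariable Poisson model per name; return a single p-value per variable
# from the likelihood-ratio test against the intercept-only model.
# An offset could be added as an extra term label, e.g.
# reformulate(c(var, "offset(n)"), response = "Y").
uni_p <- vapply(u_var, function(var) {
  fit <- glm(reformulate(var, response = "carb"),
             data = my_data, family = poisson(link = "log"))
  anova(fit, test = "Chisq")[2, "Pr(>Chi)"]
}, numeric(1))

# Names of the variables that pass the screen
keep <- names(uni_p)[uni_p <= bogie]

# Fit the multivariable model only if something survived the screen
if (length(keep) > 0) {
  multi <- glm(reformulate(keep, response = "carb"),
               data = my_data, family = poisson(link = "log"))
}
```

`keep` is then the vector to carry forward. If you want backward elimination from there, `step(multi, direction = "backward")` is one option, but note that it selects on AIC rather than on p-values.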
