Hi! I have a data set with a binary (0,1) response and both continuous and categorical predictors. I would like to test the overall importance of these predictors one by one, and I am looking for suggestions on how to do this. I am not trying to select an overall model. I am just trying to see which variables are individually significantly associated with the response.
Here is an example concerning categorical variables... Let's call the response "y". Let's say my categorical predictor is called "xCat" and has levels A, B, C, and D. I would like to test if xCat has a statistically significant association with y. I want to test for overall significance, not just significant differences from a single reference group. Here is what I have tried so far...
Option A using LRT:
fitCat <- glm(y ~ xCat, family = binomial(link = "logit"), data = data)
fit0 <- glm(y ~ 1, family = binomial(link = "logit"), data = data)
anova(fit0, fitCat, test = "LRT")
Option B using drop1:
fitCat <- glm(y ~ xCat, family = binomial(link = "logit"), data = data)
drop1(fitCat, . ~ ., test = "Chisq")
I would then look at the p-value from either of these outputs. If the p-value is > 0.05, I would conclude that there is no significant association. Is this actually testing what I think it is testing? I am concerned about violating the assumptions of normality and equal variance for the ANOVA. Any comments or suggestions?
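To check whether Option A and Option B agree, here is a minimal sketch on simulated data. The data frame `sim`, the per-level probabilities, and the names `pA` and `pB` are made up for illustration and are not my real data. It fits the same two models as above and pulls the overall p-value for xCat out of each output so they can be compared directly.

```r
set.seed(1)
n <- 200

# Simulated stand-in for my data: a four-level factor and a binary response.
sim <- data.frame(xCat = factor(sample(c("A", "B", "C", "D"), n, replace = TRUE)))

# Hypothetical success probability for each level (an assumption for this sketch).
p <- c(A = 0.2, B = 0.3, C = 0.5, D = 0.6)[as.character(sim$xCat)]
sim$y <- rbinom(n, size = 1, prob = p)

fitCat <- glm(y ~ xCat, family = binomial(link = "logit"), data = sim)
fit0 <- glm(y ~ 1, family = binomial(link = "logit"), data = sim)

# Option A: compare the intercept-only model with the model containing xCat.
lrtA <- anova(fit0, fitCat, test = "LRT")

# Option B: drop xCat from the fitted model.
lrtB <- drop1(fitCat, . ~ ., test = "Chisq")

# Overall p-value for xCat from each approach.
pA <- lrtA[2, "Pr(>Chi)"]
pB <- lrtB["xCat", "Pr(>Chi)"]
c(optionA = pA, optionB = pB)
```

In this single-predictor setup both approaches compare the same pair of nested models, so I would expect the two p-values to match, each on 3 degrees of freedom for the four levels.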
Here is an example of the continuous predictors... Let's call the continuous predictor "xCon". Here is what I've tried...
fitCon <- glm(y ~ xCon, family = binomial(link = "logit"), data = data)
summary(fitCon)
I would then look at the p-value for xCon in the output. If the p-value is > 0.05, I would conclude that there is no significant association. Is there anything I'm missing here? Are there assumptions I need to check, or common pitfalls? Let me know if you have suggestions! Thanks!
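In case it helps, here is a small sketch of the continuous case on simulated data. The data frame `sim`, the coefficients used to generate it, and the names `pWald` and `pLRT` are assumptions for illustration only. It extracts the p-value that summary() reports for xCon (a Wald z-test) and puts it next to the likelihood-ratio p-value from the same kind of nested-model comparison as Option A.

```r
set.seed(2)
n <- 200

# Simulated stand-in for my data: one continuous predictor and a binary response.
sim <- data.frame(xCon = rnorm(n))

# Hypothetical intercept and slope on the logit scale (an assumption for this sketch).
sim$y <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.8 * sim$xCon))

fitCon <- glm(y ~ xCon, family = binomial(link = "logit"), data = sim)
fit0 <- glm(y ~ 1, family = binomial(link = "logit"), data = sim)

# Wald p-value: the one shown in the summary() coefficient table.
pWald <- coef(summary(fitCon))["xCon", "Pr(>|z|)"]

# Likelihood-ratio p-value from comparing the two nested models.
pLRT <- anova(fit0, fitCon, test = "LRT")[2, "Pr(>Chi)"]

c(Wald = pWald, LRT = pLRT)
```

The two tests are not the same test, so the p-values need not be identical, although I would expect them to be close in a sample of this size.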