Hi, and welcome to the community and, as well, to the wonderfully wacky world of logistic regression. I'm in the middle of unpacking, so I can just give you the view from 40,000 feet tonight.
The typical logistic regression model is fitted in the form

glm(y ~ x_1 + ... + x_n, family = binomial)

The family = binomial argument is what makes glm() a logistic model rather than ordinary least squares.
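As a minimal sketch, here is such a fit on simulated data (the data and variable names are invented purely for illustration):

```r
# Sketch: fitting a logistic model on made-up data
set.seed(1)
d <- data.frame(age = rnorm(100, 50, 10))
d$outcome <- rbinom(100, 1, plogis(-5 + 0.1 * d$age))  # binary response

fit <- glm(outcome ~ age, family = binomial, data = d)
summary(fit)   # coefficients are on the log-odds scale
```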
There are four steps in evaluating a logistic model.
- Selection of the parameters. There are several ways to do this. One is backward elimination: start with a saturated model containing all of the available independent variables and, for a given \alpha, successively discard the x term with the largest p-value until every remaining term has a p-value no greater than \alpha.
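That elimination loop can be sketched in base R as follows (simulated data, invented variable names, \alpha = 0.05 chosen arbitrarily):

```r
# Sketch of backward elimination by p-value; not the only (or best)
# selection strategy -- step() with AIC is a common alternative.
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
d$y <- rbinom(100, 1, plogis(0.8 * d$x1))

fit <- glm(y ~ x1 + x2 + x3, family = binomial, data = d)
alpha <- 0.05
repeat {
  pv <- coef(summary(fit))[-1, "Pr(>|z|)"]        # p-values, minus intercept
  if (length(pv) == 0 || max(pv) <= alpha) break  # all terms now significant
  worst <- names(which.max(pv))                   # largest p-value
  fit <- update(fit, as.formula(paste(". ~ . -", worst)))
}
formula(fit)   # the surviving model
```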
- Calculation of the odds ratios.

odr <- function(x) {
  # exponentiate the coefficients and their profile-likelihood CIs
  exp(cbind(OR = coef(x), confint(x)))
}
This gives an indication of whether observing x makes observing y more likely (OR > 1), less likely (OR < 1) or equally likely (OR = 1); if the two-sided confidence interval for an OR excludes 1, that term's effect is statistically significant at the corresponding level.
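A quick usage sketch, again on simulated data (odr repeated here so the chunk runs on its own):

```r
odr <- function(x) {
  # exponentiate the coefficients and their profile-likelihood CIs
  exp(cbind(OR = coef(x), confint(x)))
}

set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(0.8 * x))
fit <- glm(y ~ x, family = binomial)

odr(fit)   # OR column plus 2.5% / 97.5% profile-likelihood bounds
```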
- Next comes a goodness of fit test, such as the Hosmer-Lemeshow test. Its null hypothesis H_0 is that the model fits well, so a high p-value means there is no evidence of poor fit (it cannot prove a good fit, but a low p-value is evidence against the model). The generalhoslem package will produce the test statistic with its logitgof function, and also provides tables of expected and observed frequencies.
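To show what the test is doing, here is the Hosmer-Lemeshow statistic computed by hand in base R on simulated data; the packaged functions wrap this same idea (the choice of g = 10 groups is conventional, not mandatory):

```r
# Sketch: Hosmer-Lemeshow statistic from scratch
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(-0.5 + x))
fit <- glm(y ~ x, family = binomial)

p   <- fitted(fit)                          # predicted probabilities
g   <- 10                                   # number of risk groups
grp <- cut(p, quantile(p, 0:g / g), include.lowest = TRUE)
obs <- tapply(y, grp, sum)                  # observed events per group
ex  <- tapply(p, grp, sum)                  # expected events per group
n   <- tapply(p, grp, length)

chi2 <- sum((obs - ex)^2 / (ex * (1 - ex / n)))   # HL test statistic
pval <- pchisq(chi2, df = g - 2, lower.tail = FALSE)
pval   # high value: no evidence of poor fit
```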
- If the stars align, the final step of model diagnostics may not be needed.
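When the stars don't align, one common base-R diagnostic pass looks at residuals and influence (simulated data again; the 4/n cutoff for Cook's distance is a rough rule of thumb, not a hard threshold):

```r
# Sketch of basic diagnostics for a fitted logistic model
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(x))
fit <- glm(y ~ x, family = binomial)

dres <- residuals(fit, type = "deviance")   # deviance residuals
cd   <- cooks.distance(fit)                 # influence of each observation
big  <- which(cd > 4 / length(cd))          # candidates worth a closer look
big
```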
The standard text is Hosmer DW, Lemeshow S, Sturdivant RX (2013). Applied Logistic Regression, 3rd edition. Hoboken, NJ: John Wiley & Sons.