Specifying range of columns in glm() with many dummy variables

I'm trying to run a logistic regression with dummy variables for each state. Is there an easier way to add these columns to my list of columns in glm() rather than specifying each by name?

For example, something like:

logit1 <- glm(res ~ var1 +  var2 + col[3:35] ... data=mydf)

^ This doesn't work.

Would it help to use . to refer to "all other columns"? Your formula could then be

res ~ .

For example

```r
df <- data.frame(y = sample(0:1, 50, replace = TRUE),
                 x1 = rnorm(50),
                 x2 = runif(50, 0, 20),
                 x3 = sample(0:1, 50, replace = TRUE))
glm(y ~ ., data = df, family = binomial(link = "logit"))
```

Call:  glm(formula = y ~ ., family = binomial(link = "logit"), data = df)

(Intercept)           x1           x2           x3  
   0.167237    -0.043378     0.009484    -0.966465  

Degrees of Freedom: 49 Total (i.e. Null);  46 Residual
Null Deviance:	    68.59 
Residual Deviance: 65.81 	AIC: 73.81
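
If you only want some of the columns rather than all of them, one option is to build the formula from the column names. The following is a minimal sketch, not a definitive recipe: the data frame, its column names (`d_a`, `d_b`, `unused`) and the index range `2:5` are made up for illustration, so swap in your own range (e.g. `3:35`).

```r
# Hypothetical data: a response, two covariates, two dummy columns,
# and one column we do NOT want in the model.
set.seed(1)
mydf <- data.frame(
  res    = rbinom(100, 1, 0.5),
  var1   = rnorm(100),
  var2   = rnorm(100),
  d_a    = rbinom(100, 1, 0.3),
  d_b    = rbinom(100, 1, 0.3),
  unused = rnorm(100)
)

# Select predictor columns by position and turn their names into a formula.
rhs <- names(mydf)[2:5]
f <- reformulate(rhs, response = "res")
print(f)

# Fit the logistic regression with the generated formula.
logit1 <- glm(f, data = mydf, family = binomial(link = "logit"))
```

`reformulate()` is in the stats package, so no extra packages are needed. An alternative with the same effect would be to subset the data and keep the dot: `glm(res ~ ., data = mydf[, c("res", rhs)], family = binomial)`.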


After digging around a bit more, I found out that R automatically creates dummy variables when you include a chr variable like "state" in the formula. I had assumed that I had to generate new dummy variable columns (as in Stata) and then include them in the regression. So all I had to do was include "state" in my list of RHS variables in glm().
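
For anyone landing here later, a small sketch of what that looks like. The data are invented for illustration (three made-up state codes, a simulated response), so only the shape of the result matters, not the estimates.

```r
# Hypothetical data with a character column "state".
set.seed(42)
mydf <- data.frame(
  res   = rbinom(200, 1, 0.5),
  var1  = rnorm(200),
  state = rep(c("CA", "NY", "TX"), length.out = 200),
  stringsAsFactors = FALSE
)

# The character column goes straight into the formula; glm() treats it
# as a factor and expands it into indicator (dummy) columns.
fit <- glm(res ~ var1 + state, data = mydf, family = binomial(link = "logit"))

# One coefficient per state except the reference level.
print(names(coef(fit)))
```

With the default treatment contrasts, the first level in alphabetical order ("CA" here) is the reference category and gets no coefficient of its own, so the names are `(Intercept)`, `var1`, `stateNY` and `stateTX`. If you want a different baseline, converting the column with `relevel(factor(mydf$state), ref = "TX")` before fitting should do it.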
