I wanted to get suggestions/resources on choosing the right model for my data.
I have unbalanced panel data (df) with a binary outcome (0 or 1). The predictor variables (lagged) are a mix of time-varying (e.g. number of children per year) and time-invariant (e.g. ethnicity) variables, and include categorical variables (converted to factors) as well as continuous ones. I have structured the data in long format, with the first column as the individual identifier (individual ID) and the second column as time (year), followed by the other variables.
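To make the structure concrete, here is a minimal sketch of what the data look like. The values are made up and the names (df_toy, id, year, Var1_lag) are placeholders rather than my real columns. It uses base R only, and the lag is taken by row within each individual, so it assumes there are no gaps in an individual's years.

```r
# Toy stand-in for df: values are invented, column names are placeholders
df_toy <- data.frame(
  id      = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  year    = c(2001, 2002, 2003, 2001, 2002, 2000, 2001, 2002, 2003),
  outcome = c(0, 0, 1, 1, 1, 0, 1, 0, 0),
  Var1    = c(0, 1, 1, 2, 2, 0, 0, 1, 2),  # time-varying, e.g. number of children
  Var2    = factor(c("a", "a", "a", "b", "b", "a", "a", "a", "a"))  # time-invariant
)

# Sort by individual and year so that "previous row" means "previous year"
df_toy <- df_toy[order(df_toy$id, df_toy$year), ]

# Lag Var1 by one observation within each individual (NA in the first year);
# this assumes consecutive years with no gaps for each individual
df_toy$Var1_lag <- ave(df_toy$Var1, df_toy$id,
                       FUN = function(x) c(NA, head(x, -1)))

# Unbalanced panel: individuals contribute different numbers of years
obs_per_id <- table(df_toy$id)
print(obs_per_id)
```

The first two columns play the role of the individual and time indicators, and the unequal counts per id are what I mean by "unbalanced".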
I have been looking at the glm and plm functions to do this. I am very new to this and have just been exploring, so I wanted to get your thoughts.
These are the things I am trying:
# with glm
Modelglm <- glm(outcome ~ Var1 + Var2 + Var3, family = binomial(link = "logit"), data = df)
# with plm, fixed effects
ModelplmFE <- plm(outcome ~ Var1 + Var2 + Var3, data = df, model = "within")
# with plm, random effects
ModelplmRE <- plm(outcome ~ Var1 + Var2 + Var3, data = df, model = "random")
I have also been looking at individual and time effects in both the random-effects and fixed-effects models:
ModelplmITE_RE <- plm(outcome ~ Var1 + Var2 + Var3, data = df, effect = "twoways", model = "random")
Also, should I work strictly with plm instead of glm, since I have panel data?