Choosing models for unbalanced panel data in R

Hi community,

I wanted to get suggestions/resources on choosing the right model for my data.

I have an unbalanced panel data set (df) with a binary outcome (0 or 1). The predictor variables (lagged) are a mix of time-varying (e.g. number of children per year) and time-invariant (e.g. ethnicity) variables, and include categorical variables (converted to factors) as well as continuous ones. I have structured the data in long format, with the first column as the individual identifier (individual ID) and the second column as time (year), followed by the other variables.

I have been looking at the glm and plm functions to do this. I am very new to this and have just been exploring, so I wanted to get your thoughts.
These are the things I am trying:

library(plm)

# with glm (pooled logit)
Modelglm <- glm(outcome ~ Var1 + Var2 + Var3, family = binomial(link = "logit"), data = df)
# with plm fixed effects (with no index given, plm uses the first two columns, ID and year)
ModelplmFE <- plm(outcome ~ Var1 + Var2 + Var3, data = df, model = "within")
# with plm random effects
ModelplmRE <- plm(outcome ~ Var1 + Var2 + Var3, data = df, model = "random")

I have also been looking at individual and time effects in both the random-effects and fixed-effects models:

ModelplmITE_RE <- plm(outcome ~ Var1 + Var2 + Var3, data = df, effect = "twoways",
                  model = "random")

Also, should I work strictly with plm instead of glm, since I have panel data?
Thank you.

The model needs to accommodate a binary outcome. Autoregression in some of the treatment variables may present a complication that requires time series analysis, but that shouldn't be assumed. The {pglm} package advertises itself as being able to apply glm models to panel data.
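
As a minimal sketch of what that could look like, the example below simulates a small unbalanced panel and fits a random-effects logit with pglm(). The simulated data, the column names id and year, and the settings method = "bfgs" and R = 10 are assumptions for illustration only and are not taken from the original post, so check the pglm documentation before relying on them.

```r
library(pglm)  # assumed to be installed from CRAN

set.seed(1)

# Hypothetical unbalanced panel: 200 individuals observed for 2 to 6 years each
n_id    <- 200
n_years <- sample(2:6, n_id, replace = TRUE)
df <- data.frame(
  id   = rep(seq_len(n_id), times = n_years),
  year = 2010 + sequence(n_years)
)

# Predictors named as in the question (the values are simulated)
df$Var1 <- rnorm(nrow(df))                    # continuous, time-varying
df$Var2 <- rpois(nrow(df), lambda = 1)        # count, time-varying
group   <- sample(c("a", "b", "c"), n_id, replace = TRUE)
df$Var3 <- factor(group[df$id])               # categorical, time-invariant

# Binary outcome generated with an individual-specific random intercept
u   <- rnorm(n_id)
eta <- -0.5 + 0.8 * df$Var1 - 0.3 * df$Var2 + u[df$id]
df$outcome <- rbinom(nrow(df), size = 1, prob = plogis(eta))

# Random-effects logit; index names the individual and time columns
fit_re <- pglm(outcome ~ Var1 + Var2 + Var3, data = df,
               family = binomial("logit"), model = "random",
               index = c("id", "year"), method = "bfgs", R = 10)
summary(fit_re)
```

A random-effects specification is used here because it can keep time-invariant predictors such as Var3 in the model, whereas a within (fixed-effects) transformation would drop them.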
