This is more of a statistics theory question than an R question specifically, but I hope some kind soul will help an econometrics and R newbie. I am working in R with the plm package.
I have 10k+ firm-year observations of ~1000 firms with ~2k CEOs over 15 years in an unbalanced panel. I am looking into the effect of a time-invariant CEO background variable on firms' investment.
My dependent variable is the capex ratio. The independent variables include dummies for the CEO's background, sex and education, which are time-invariant for a given CEO, as well as time-varying variables such as CEO age, firm total assets, leverage etc. A single company often, but not always, has multiple CEOs over the sample period. Thus, for a firm with only one CEO, the CEO-level time-invariant variables are also time-invariant at the firm level.
I am modeling my research on similar earlier research, where the authors used industry and year fixed effects (e.g. Henderson et al 2017 "Lawyer CEOs"). I would like to include the same fixed effects, but my understanding is that fixed effects drop any time-invariant variables.
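To make that understanding concrete, here is a minimal base-R sketch on made-up data. The `toy` data frame and its columns `firm`, `industry`, `year` and `background` are hypothetical and are not my real dataset. It applies the within (demeaning) transformation by hand, assuming that this is what a fixed effect on a given grouping variable does to a regressor:

```r
# Toy balanced panel: 4 hypothetical firms observed over 5 years.
toy <- expand.grid(firm = 1:4, year = 2001:2005)

# Industry is constant within a firm (firms 1-2 in "A", firms 3-4 in "B").
toy$industry <- ifelse(toy$firm <= 2, "A", "B")

# A dummy that never changes within a firm (e.g. a single CEO per firm),
# but that differs between firms in the same industry and the same year.
toy$background <- as.numeric(toy$firm %in% c(1, 3))

# Within transformation: subtract the group mean from each observation.
within_firm     <- toy$background - ave(toy$background, toy$firm)
within_industry <- toy$background - ave(toy$background, toy$industry)
within_year     <- toy$background - ave(toy$background, toy$year)

# After demeaning by firm the dummy is zero everywhere, so nothing is left to estimate.
all(within_firm == 0)

# After demeaning by industry or by year, variation remains.
any(within_industry != 0)
any(within_year != 0)
```

If this reasoning is right, a variable is only wiped out by fixed effects defined on a group within which it never varies, which is what I am trying to confirm for my own setup.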
I ran my regressions with the plm package using the following code (list of control variables shortened), which gives me statistically significant results:
    Capex.fe <- plm(Capex.ratio ~ Background.CEO + Education.CEO + Age.CEO +
                      Ln.total.assets + factor(Industry),
                    data = Paneldata, index = c("Fiscal.Year"),
                    model = "within", effect = "individual")
Am I doing something wrong, either in the code or in the theory? Or is using the time-invariant CEO variables okay because I am using fixed effects at the industry level rather than the firm level, where the CEO might be the same throughout the period?
Thank you to anyone who can give me advice on this theory issue. As my code runs, I have not included an example of my dataset or the regression results. If that would be helpful, I am happy to add them.