No change in the coefficients of my time-specific variables of interest when controlling for demographic effects


Ciao Guys,

I have a balanced panel data set and I study the following model:

lm(value ~ taskXjan20 + taskXfeb20 + taskXmar20 + taskXapr20 + demographic_criteria),

where value is a binary variable equal to 1 if individual i was unemployed in the previous month and 0 otherwise, so the model describes the probability of becoming unemployed. I would like to explain this probability with the task group an individual is assigned to (also a binary variable: 1 for the specific task group). This task-group variable is interacted with a dummy for the specific month. However, when I add demographic criteria as control variables (quite a lot in total, ranging from age to education), I do not see any difference for my coefficients.

Further, when I run a regression like this:

lm(value ~ task + taskXface_to_face + taskXremote + taskXessential + demographic_criteria),

where my task variable is now interacted with time-independent variables, the coefficients change substantially once I control for demographic effects.

My question now is: did I forget to adjust my regression for further trends when my variables of interest are interacted directly with time-specific variables? And if so, why did I not have to include them in the second regression?

Hope somebody can help me!

Many thanks in advance, Freddy

"I do not see any difference for my coefficients."

Can you explain this more? I'm not sure what you mean by this sentence.

A general point is that you might be asking a lot of your data to find all those interactions. You might consider a main-effects model first and check the VIF for all your variables. After adding the interactions, you might find some VIFs with very high values. In that case, you have to scale back to a simpler model.
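To make the VIF check concrete, here is a minimal base-R sketch on simulated data. All variable names are made up for illustration, and the VIF is computed by hand as 1 / (1 - R²) from regressing each predictor on the others; in practice a packaged function such as `car::vif()` is commonly used instead.

```r
# Simulated data; every variable name here is hypothetical.
set.seed(1)
n <- 500
task  <- rbinom(n, 1, 0.4)    # task-group dummy
mar20 <- rbinom(n, 1, 0.25)   # month dummy
age   <- rnorm(n, 40, 10)     # a demographic control
d <- data.frame(
  task       = task,
  mar20      = mar20,
  taskXmar20 = task * mar20,  # interaction term
  age        = age
)

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all the other predictors.
vif_manual <- function(data) {
  sapply(names(data), function(v) {
    others <- setdiff(names(data), v)
    r2 <- summary(lm(reformulate(others, response = v), data = data))$r.squared
    1 / (1 - r2)
  })
}

vifs <- vif_manual(d)
print(round(vifs, 2))
```

Values close to 1 indicate little collinearity; interaction terms will usually show higher values than unrelated controls because they are built from the main effects.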

Maybe you can use the plm package, which will help you control for individual and time effects with FE/RE models. Also, you probably want to use logit/probit regression since your dependent variable is a dummy. You can do that with the pglm package.
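As a rough illustration of the logit suggestion, the sketch below fits a pooled logit with base R's `glm()` on simulated data. This is not the panel estimator that plm or pglm would give you (it ignores individual effects), and the data, variable names and effect sizes are all assumptions made up for the example.

```r
# Simulated balanced panel; all names and numbers are hypothetical.
set.seed(2)
n_id   <- 200
months <- c("jan20", "feb20", "mar20", "apr20")
panel  <- expand.grid(id = seq_len(n_id), month = months,
                      stringsAsFactors = TRUE)

# Time-invariant characteristics, one draw per individual.
task_by_id <- rbinom(n_id, 1, 0.5)
age_by_id  <- round(rnorm(n_id, 40, 10))
panel$task <- task_by_id[panel$id]
panel$age  <- age_by_id[panel$id]

# Binary outcome generated from an assumed logit model.
eta <- -2 + 0.8 * panel$task * (panel$month == "apr20") +
  0.01 * (panel$age - 40)
panel$value <- rbinom(nrow(panel), 1, plogis(eta))

# Pooled logit: task interacted with each month, plus one control.
fit_logit <- glm(value ~ task:month + age,
                 data = panel, family = binomial(link = "logit"))
print(summary(fit_logit)$coefficients)
```

The coefficients are on the log-odds scale, so they are not directly comparable to the linear probability coefficients from `lm()`.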

Maybe I did not explain it precisely enough.
My issue is the following: I want to estimate the interaction of a dummy variable (the task group of an individual) with a time dummy variable. Something like: y = taskgroup1march2020 + taskgroup1april2020.

If I now add demographic control variables such as age or gender, the coefficients of (in this case) both interaction variables do not change.

Whenever I replace the time dummies with non-binary variables, I do obtain different results when controlling for demographic effects.

I really don't know why this is the case.

Many thanks in advance

I see. The coefficients should change at least a little bit. I wonder if the demographic variables were dropped from the regression due to singularities. Did you get any warnings about singularities or NA values in the coefficients?
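One way to check for dropped terms is to look for NA entries in the coefficient vector directly. As far as I know, `lm()` does not raise a warning when it drops a collinear column; it just reports NA for it. The sketch below uses simulated data with a deliberately redundant control (`age_months`, a made-up name) to show what that looks like.

```r
# Simulated data with a deliberately collinear control; names are hypothetical.
set.seed(3)
n <- 300
d <- data.frame(
  task = rbinom(n, 1, 0.5),
  age  = rnorm(n, 40, 10)
)
d$age_months <- d$age * 12        # exact linear function of age
d$value      <- rbinom(n, 1, 0.2)

fit <- lm(value ~ task + age + age_months, data = d)

# Coefficients that lm() could not estimate are returned as NA.
dropped <- names(coef(fit))[is.na(coef(fit))]
print(dropped)

# alias() lists the linear dependencies behind the dropped terms.
print(alias(fit))
```

If `dropped` is empty for your model, the controls were estimated and singularity is not the explanation.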

No, I did not get any warnings. The coefficients do change slightly, but only at the 5th decimal place.
That is not comparable to the change I would expect (or to what previous researchers have found).
Many thanks

Could you provide a reprex?
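A reprex for this question could be as small as a simulated balanced panel plus the two fits side by side. The sketch below is only a starting template: the data are random, the variable names mirror the ones in the question, and the single control `age` stands in for the full set of demographic criteria.

```r
# Simulated balanced panel; all data are made up for illustration.
set.seed(4)
n_id   <- 300
months <- c("dec19", "jan20", "feb20", "mar20", "apr20")
panel  <- expand.grid(id = seq_len(n_id), month = months,
                      stringsAsFactors = FALSE)

# Time-invariant task group and age, one draw per individual.
task_by_id <- rbinom(n_id, 1, 0.5)
age_by_id  <- rnorm(n_id, 40, 10)
panel$task <- task_by_id[panel$id]
panel$age  <- age_by_id[panel$id]

# Task-by-month interaction dummies, named as in the question.
for (m in c("jan20", "feb20", "mar20", "apr20")) {
  panel[[paste0("taskX", m)]] <- panel$task * (panel$month == m)
}

# Binary outcome with an assumed effect in April only.
panel$value <- rbinom(nrow(panel), 1, 0.1 + 0.05 * panel$taskXapr20)

# Fit without and with the demographic control, then compare.
fit_base <- lm(value ~ taskXjan20 + taskXfeb20 + taskXmar20 + taskXapr20,
               data = panel)
fit_ctrl <- update(fit_base, . ~ . + age)

comparison <- cbind(
  base     = coef(fit_base),
  with_age = coef(fit_ctrl)[names(coef(fit_base))]
)
print(round(comparison, 5))
```

Replacing the simulated columns with a small anonymised extract of the real data would let others see exactly how much the interaction coefficients move.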
