Specifying (and interpreting) LMMs using factorial designs with lme4

Hi, I’m trying to specify (and interpret) an LMM using data with the following factorial design:

• Condition (Active/Sham: between-subjects)
• Session (1/2/3: within-subjects)
• nbacklevel (1/2: within-subjects)

I have used dummy coding for now, which I know makes interpretation difficult when there is more than one factor (and/or factors with more than two levels).

My question is: is it important to specify variables as factors when building an LMM? I ask because I get different outputs depending on whether the variables are specified as factors. For example, when I specify Session and nbacklevel as factors I get the following output for the fixed effects:

Fixed effects: vis_hits ~ Condition * Session + nbacklevel
                            Value  Std.Error DF   t-value p-value
(Intercept)             0.9666667 0.11666667  5  8.285714  0.0004
ConditionSham           3.0000000 0.15275252  0 19.639610     NaN
Session2                1.0000000 0.15275252  5  6.546537  0.0012
Session3                1.9500000 0.15275252  5 12.765747  0.0001
nbacklevel2             0.5666667 0.08819171  5  6.425396  0.0014
ConditionSham:Session2  0.0000000 0.21602469  5  0.000000  1.0000
ConditionSham:Session3 -0.2000000 0.21602469  5 -0.925820  0.3970

Is it correct here that Active is the reference category for Condition, Session 1 is the reference category for Session, and nbacklevel 1 is the reference category for nbacklevel? And if so, would the coefficient for Session2 represent the difference between Session 2 and Session 1 for the Active group at nbacklevel 1?

However, if Session and nbacklevel aren’t coded as factors I get the following output:

Fixed effects: vis_hits ~ Condition * Session + nbacklevel
                           Value  Std.Error DF   t-value p-value
(Intercept)           -0.5666667 0.19462010  7 -2.911655  0.0226
ConditionSham          3.1333333 0.21473498  0 14.591630     NaN
Session                0.9750000 0.07028852  7 13.871397  0.0000
nbacklevel             0.5666667 0.08116219  7  6.981904  0.0002
ConditionSham:Session -0.1000000 0.09940298  7 -1.006006  0.3479

Here, Active is again the reference category for Condition, but I can't work out what the 'Session' and 'nbacklevel' coefficients represent or how they change the intercept. The values for Intercept and ConditionSham are different from the first model, and less interpretable to me. I've been told that 'Session' is the slope for the Active condition, but I don't follow why that is the case.

Any help would be appreciated! Thanks.

Whether to specify a variable as numeric or as a factor depends on its role in the study. It should be a discussion between the scientist and the statistician.

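In this particular case, I think these should all probably be factors. Specifying Session as numeric assumes a linear trend across Sessions 1, 2 and 3, which may not hold. Maybe Sessions 2 and 3 have the same effect, or maybe Session 2 has the largest effect; only a factor can describe these scenarios. That linearity assumption is also why the numeric 'Session' coefficient is a slope: it is the change in vis_hits per one-session step for the reference (Active) group, and the intercept becomes the prediction at Session = 0 and nbacklevel = 0, which lies outside your data.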
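To make the difference between the two codings concrete, here is a rough sketch that rebuilds the predicted vis_hits for each cell from the fixed-effect estimates posted in the question. It is plain Python used only for the arithmetic (it does not refit anything), and the helper names `predict_factor` and `predict_numeric` are made up for this illustration. It assumes the default treatment (dummy) contrasts, which is what the coefficient names in the first output suggest.

```python
# Fixed-effect estimates copied from the two outputs in the question.
factor_fit = {
    "(Intercept)": 0.9666667,
    "ConditionSham": 3.0,
    "Session2": 1.0,
    "Session3": 1.95,
    "nbacklevel2": 0.5666667,
    "ConditionSham:Session2": 0.0,
    "ConditionSham:Session3": -0.2,
}
numeric_fit = {
    "(Intercept)": -0.5666667,
    "ConditionSham": 3.1333333,
    "Session": 0.975,
    "nbacklevel": 0.5666667,
    "ConditionSham:Session": -0.1,
}


def predict_factor(condition, session, nback):
    """Factor coding: a coefficient is added only when its level applies.

    The intercept is the reference cell (Active, Session 1, nbacklevel 1).
    """
    b = factor_fit
    sham = condition == "Sham"
    y = b["(Intercept)"]
    if sham:
        y += b["ConditionSham"]
    if session > 1:
        y += b[f"Session{session}"]
        if sham:
            y += b[f"ConditionSham:Session{session}"]
    if nback == 2:
        y += b["nbacklevel2"]
    return y


def predict_numeric(condition, session, nback):
    """Numeric coding: Session and nbacklevel enter as numbers times a slope.

    The intercept is the prediction at Session = 0 and nbacklevel = 0.
    """
    b = numeric_fit
    sham = 1 if condition == "Sham" else 0
    return (
        b["(Intercept)"]
        + b["ConditionSham"] * sham
        + b["Session"] * session
        + b["nbacklevel"] * nback
        + b["ConditionSham:Session"] * sham * session
    )


# Compare the two sets of predictions for every cell at nbacklevel 1.
for condition in ("Active", "Sham"):
    for session in (1, 2, 3):
        print(
            condition,
            session,
            round(predict_factor(condition, session, 1), 3),
            round(predict_numeric(condition, session, 1), 3),
        )
```

With factors, Session2 and Session3 are separate jumps from Session 1 in the Active group (1.0 and 1.95), so the steps between sessions are free to differ. With numeric Session, every step in the Active group is forced to be the same 0.975, and the Sham steps are 0.975 - 0.1 = 0.875. The negative intercept in the second output is just that line extrapolated back to Session 0 and nbacklevel 0, which is why it looks uninterpretable.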


Thank you very much @arthur.t - that makes sense!
