Needing help on Linear Mixed Models

Hi everyone,

I am a beginner in linear mixed models and would like some advice on how to analyze my data. I work in the field of cognitive neuroscience and my research focuses on understanding face processing in adults. We measure face processing using eye-tracking measures (here pupil dilation) and a paradigm using social (images and videos of real faces and avatars) and non-social (objects) stimuli.

I would like to explore whether physiological engagement, indexed by variations in pupil diameter, is caused by motion. To do that, we quantified the amount of motion in each video with a coefficient. My data set consists of 7 variables (participant, movements, actors/actresses, categories, movement coefficient, and pupil dilation) with 1320 observations.

  • categories are broken down into 3 components: object (non-social stimulus), avatars, and real faces (social stimulus)
  • movements are broken down into 3 components: static, micro, and macro movement
  • actors/actresses: there are 4 videos (2 actors + 2 actresses) per category per movement
  • movement coefficient: one coefficient per stimulus, since it depends on the actor or actress. They are different people, so the amount of motion differs even within the same movement type (micro movement -> neutral expression; macro movement -> neutral to happy and neutral to sad). For static stimuli, the motion coefficient is 0 for avatars and real faces, which are still photographs. For objects, the "static" stimuli are videos with very small motion coefficients, smaller than those of the micro and macro movements.

I thought an LMM would be a good way to answer my question, as it can account for both fixed and random effects, but I am a bit lost about how to write the model. I tried a random intercept for each participant, which allows that participant's baseline pupil dilation to deviate from the population intercept. In addition, I was thinking of adding a random intercept for each motion coefficient, which would allow the pupil dilation intercept at that coefficient to deviate from the intercept across the whole sample of motion coefficients.

I have tried several LMMs, but I don't know whether the models I tested are specified correctly for my research question: can motion predict pupil dilation? Does the amount of motion influence pupil dilation?

```r
modela <- lmer(pupil_dilation ~ categories * movements * actors + (1 | participants) + (1 | motion_coef), data = data, REML = FALSE)
modelb <- lmer(pupil_dilation ~ categories + movements + actors + categories:movements:actors + (1 | participants) + (1 | motion_coef), data = data, REML = FALSE)
modelc <- lmer(pupil_dilation ~ categories + movements + actors + (1 | participants) + (1 | motion_coef), data = data, REML = FALSE)
modeld <- lmer(pupil_dilation ~ categories + movements + (1 | participants) + (1 | motion_coef), data = data, REML = FALSE)
modele <- lmer(pupil_dilation ~ categories + (1 | participants) + (1 | motion_coef), data = data, REML = FALSE)
```

For your information: categories, movements, actors, motion_coef, and participants were converted to factors.

So a few questions come to mind:

  • Is LMM a good way to answer my question?
  • Do I have to normalize my data before starting my LMM?
  • Do the models above seem consistent with my research question?

I hope my description was clear. I am sorry if I didn't explain the LMM part well, but as I am new to this I tried my best!

Thank you all in advance for your precious help!
Camille 🙂

That's a thoughtful way to frame questions. Lacking a reprex (see the FAQ: How to do a minimal reproducible example (reprex) for beginners), I can only offer some general thoughts.

First, though, is the movement coefficient continuous or has it been categorized into intervals?
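
Here is why the answer matters, as a minimal sketch on simulated data (not your data). If the coefficient is continuous, it can enter a model as a numeric fixed effect, which estimates a single slope for pupil dilation per unit of motion, instead of being converted to a factor. Only the names `pupil_dilation`, `motion_coef`, and `participants` come from your post; the `stimulus` column, the counts of participants and stimuli, and every effect size below are assumptions made up for illustration.

```r
library(lme4)

set.seed(1)

# Hypothetical design: 20 participants each see the same 36 stimuli
n_participants <- 20
n_stimuli <- 36
dat <- expand.grid(
  participants = factor(seq_len(n_participants)),
  stimulus     = factor(seq_len(n_stimuli))
)

# One motion coefficient per stimulus, kept numeric (not a factor)
motion_by_stimulus <- runif(n_stimuli, min = 0, max = 1)
dat$motion_coef <- motion_by_stimulus[as.integer(dat$stimulus)]

# Simulated outcome: assumed true slope of 0.8, plus participant-level
# and stimulus-level intercept shifts and residual noise
participant_shift <- rnorm(n_participants, sd = 0.5)
stimulus_shift    <- rnorm(n_stimuli, sd = 0.2)
dat$pupil_dilation <- 3 +
  0.8 * dat$motion_coef +
  participant_shift[as.integer(dat$participants)] +
  stimulus_shift[as.integer(dat$stimulus)] +
  rnorm(nrow(dat), sd = 0.3)

# motion_coef as a continuous fixed effect; participants and stimuli
# as crossed random intercepts
fit <- lmer(
  pupil_dilation ~ motion_coef + (1 | participants) + (1 | stimulus),
  data = dat, REML = FALSE
)

# Estimated change in pupil dilation per unit of motion
fixef(fit)["motion_coef"]
```

If the coefficient has instead been binned into intervals, it behaves like another categorical predictor and the slope interpretation no longer applies.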

Now, let's get philosophical and return to school algebra: f(x) = y, where

x is your tabular data, of dimension 1320 × 7.
y is a transformation of x that abstracts away the detail to put a measure on the information contained in x.
f is the function or functions (think f(g(x))) that does the transformation.

Although phrased as a question about linear mixed models, the heart of the analysis is the selection of y. What compact measure(s) best describe(s) the relationship between pupil diameter (the chosen measure for physiological engagement) and the combinations of stimuli that were presented? This question leads to

  1. Is there any relationship at all worth looking at or is it just random?

  2. If there is a relationship, how "close" or distant is it from randomness? This is the question that the often misunderstood p-value of many statistical tests addresses. An f is applied to x to produce a test statistic, and the p-value is the probability of seeing a statistic at least that extreme if nothing but random variation were at work. At the conventional threshold of 0.05, that means random variation alone would produce such a result only about one time in twenty. Depending on the phenomenon being gauged, that may mean "it passes the laugh test and is worth a closer look" or "that's not good enough to trust the lives of millions to."

  3. If the data do show an associative relationship that passes the pre-selected significance threshold, do the data permit causal inference? Can the relationships among the multiple stimuli be teased apart to test for one while keeping the others constant? Are there mediators? Colliders? Stimuli that only have an effect indirectly? Both directly and indirectly? This is the domain of causal inference, which in former times was heretical. Today we have tools, such as directed acyclic graphs, that make it possible when applied carefully.
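
The "one in twenty" reading in point 2 can be checked by simulation. This is only a sketch under assumptions of my own: two unrelated normal variables, 50 observations per data set, and 2000 repetitions, none of which come from your study. Because the two variables are generated independently, every "significant" result is a false alarm, and roughly 5% of the p-values should fall below 0.05.

```r
set.seed(42)

n_sims <- 2000  # number of simulated data sets (arbitrary choice)
n_obs  <- 50    # observations per data set (arbitrary choice)

# For each simulated data set, regress y on an unrelated x and keep
# the p-value of the slope
p_values <- replicate(n_sims, {
  x <- rnorm(n_obs)
  y <- rnorm(n_obs)  # generated independently of x: no true relationship
  summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
})

# Share of simulations that cross the 0.05 threshold by chance alone
false_positive_rate <- mean(p_values < 0.05)
false_positive_rate
```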

The first question is the most general and the easiest to overlook in the eagerness to get to a conclusion. The tools of exploratory data analysis are designed to hold the drive to select y, the gauge of the outcome, in abeyance, so as to see what the data are capable of revealing.
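
As a minimal sketch of that kind of first look, here are cell means of the outcome across the two stimulus factors. The data are simulated with no built-in effect, and the participant count and level labels are placeholders; only the column names `participants`, `categories`, `movements`, and `pupil_dilation` are taken from the post.

```r
set.seed(7)

# Simulated stand-in data: 10 hypothetical participants, each seeing
# every category x movement combination once
dat <- expand.grid(
  participants = factor(1:10),
  categories   = c("object", "avatar", "real_face"),
  movements    = c("static", "micro", "macro")
)
dat$pupil_dilation <- rnorm(nrow(dat), mean = 3, sd = 0.5)

# Long format: one row per category x movement cell
cell_means <- aggregate(
  pupil_dilation ~ categories + movements,
  data = dat, FUN = mean
)
cell_means

# Wide format: categories in rows, movements in columns
cell_table <- with(
  dat,
  tapply(pupil_dilation, list(categories, movements), mean)
)
round(cell_table, 2)
```

With real data, a table like this (or the matching boxplots) is a quick way to see whether any cell stands out before committing to a model.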

You may have already done this, in which case we should be looking at what metrics are available to address the case of a continuous outcome variable $Y$ in the presence of $X_1 \dots X_6$ where the treatment variables are categorical. If you have, I'd encourage writing the EDA phase up in semi-formal fashion. From experience, I know that it's easy to let the spark of an idea die if it is left to fend for itself among a mass of mixed notes, files, scripts and what not.

The motivation for my question about whether the motion coefficients are continuous is that the case of two continuous variables is simple to look at and potentially informative. Linear regression/ANOVA is unreasonably effective in assessing the threshold question: is there any there there? If not, move on.
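
A sketch of that threshold check, assuming the motion coefficient is continuous. The data are simulated with an assumed slope of 0.5 and an arbitrary sample size of 400, and the repeated measures per participant are deliberately ignored here, so this is a first look and not the final model.

```r
set.seed(123)

n <- 400  # arbitrary sample size for the illustration

# Simulated continuous predictor and outcome with an assumed slope of 0.5
motion_coef    <- runif(n, min = 0, max = 1)
pupil_dilation <- 3 + 0.5 * motion_coef + rnorm(n, sd = 0.4)

# Ordinary linear regression of the outcome on the motion coefficient
fit <- lm(pupil_dilation ~ motion_coef)

# Estimate, standard error, t value and p-value for intercept and slope
summary(fit)$coefficients

# Proportion of variance in the outcome explained by motion alone
summary(fit)$r.squared
```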

Are any of the variables ordinal in their categorization, such as movements? If these were measured as giraffe, watermelon and quartz there would be no ordering, but static, micro and macro suggest a ranking. That may bear on the choice of model.
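
In R the distinction shows up in how the factor is declared. A small sketch with a made-up vector of movement labels: a plain factor sorts its levels alphabetically and gets treatment contrasts by default, while an ordered factor keeps the stated ranking and gets polynomial (linear, quadratic) contrasts.

```r
# Made-up labels, in no particular order
movements <- c("micro", "static", "macro", "static", "macro", "micro")

# Nominal: levels are sorted alphabetically, with no ranking implied
movements_nominal <- factor(movements)
levels(movements_nominal)

# Ordinal: levels follow the stated ranking static < micro < macro
movements_ordered <- factor(
  movements,
  levels  = c("static", "micro", "macro"),
  ordered = TRUE
)
levels(movements_ordered)

# Order comparisons are only defined for the ordered version
movements_ordered < "macro"

# Default contrasts differ: treatment for nominal, polynomial for ordered
contrasts(movements_nominal)
contrasts(movements_ordered)
```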

Also, a nonlinear mixed effects model may be more appropriate. Depending.

I don't have any experience going through this process and making these decisions. I can only just follow the {lme4} vignette. A long career chasing false hopes has left me cautious about setting sail before fully understanding the seas and winds to be expected.
