# How to run a multilevel regression model with crossed random effects: items and participants?

I am trying to run a multilevel regression for my study.
I have two random effects: participants (97) and items (the 20 words used in the study).
Each participant had to spell the same words.

My outcome variable is spelling accuracy, which has two levels: 1 for correct and 0 for incorrect.

My predictor variables are all continuous word features: word length, word frequency, OLD20 (neighbourhood density), and OLDF (neighbourhood frequency).

I want to use the raw spelling scores in a regression model without aggregating them first, which is why I need a multilevel regression model.

I'm trying to figure out the correct code to use but haven't had any luck so far. This is what I've got:

```r
M1 <- lmer(
  spelling_accuracy ~ 1 + OLD20 + OLDF + wordlength_letter + word_frequency +
    (1 | items) + (1 | participants),
  data = combined_df,
  REML = FALSE
)
```

This gives me the following warning:

```
fixed-effect model matrix is rank deficient so dropping 18 columns / coefficients
boundary (singular) fit: see help('isSingular')
```

Then when I check the model by running:

```r
library(jtools)
summ(M1)
```

I get the following:

```
MODEL INFO:
Observations: 1940
Dependent Variable: spelling_accuracy
Type: Mixed effects linear regression

MODEL FIT:
AIC = 1658.29, BIC = 1786.41
Pseudo-R² (fixed effects) = 0.14
Pseudo-R² (total) = 0.53

FIXED EFFECTS:
----------------------------------------------------------
                     Est.   S.E.   t val.      d.f.      p
----------------- ------- ------ -------- --------- ------
(Intercept)          0.52   0.05    11.01    397.48   0.00
OLD201.7             0.19   0.05     3.79   1843.00   0.00
OLD201.85            0.26   0.05     5.26   1843.00   0.00
OLD201.9             0.14   0.05     2.94   1843.00   0.00
OLD201.95            0.22   0.05     4.42   1843.00   0.00
OLD202.0             0.12   0.05     2.52   1843.00   0.01
OLD202.25           -0.40   0.05    -8.20   1843.00   0.00
OLD202.35           -0.01   0.05    -0.21   1843.00   0.83
OLD202.45            0.07   0.05     1.47   1843.00   0.14
OLD202.5            -0.01   0.05    -0.21   1843.00   0.83
OLD202.65            0.16   0.05     3.37   1843.00   0.00
OLD202.7            -0.02   0.05    -0.42   1843.00   0.67
OLD202.9            -0.06   0.05    -1.26   1843.00   0.21
OLD203.0             0.20   0.05     4.00   1843.00   0.00
OLD203.05           -0.44   0.05    -9.05   1843.00   0.00
OLD203.35           -0.02   0.05    -0.42   1843.00   0.67
OLD203.4            -0.16   0.05    -3.37   1843.00   0.00
OLD203.5             0.15   0.05     3.16   1843.00   0.00
OLDF12.7            -0.21   0.05    -4.21   1843.00   0.00
OLDF4.6              0.04   0.05     0.84   1843.00   0.40
----------------------------------------------------------

p values calculated using Satterthwaite d.f.

RANDOM EFFECTS:
----------------------------------------
    Group        Parameter    Std. Dev.
-------------- ------------- -----------
 participants   (Intercept)     0.31
    items       (Intercept)     0.00
   Residual                     0.34
----------------------------------------

Grouping variables:
--------------------------------
    Group       # groups   ICC
-------------- ---------- ------
 participants      97      0.45
    items          20      0.00
--------------------------------
```

I'm not sure why the output shows only OLD20 and OLDF, and why it presents their values as separate levels. I have made sure to code the variables correctly:

```r
as.numeric(combined_df$OLD20)
as.numeric(combined_df$OLDF)
as.factor(combined_df$spelling_accuracy)
```
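One thing worth checking about the conversion calls above: coefficient names such as `OLD201.7` are how R labels the levels of a factor (or character) predictor, and `as.numeric()` returns a converted copy without changing the data frame unless the result is assigned back. The sketch below illustrates this on a made-up data frame (`toy_df` and its values are hypothetical stand-ins, not the real `combined_df`), assuming the columns are currently stored as factor or character.

```r
# Toy stand-in for combined_df (hypothetical values, not the real data)
toy_df <- data.frame(
  OLD20 = factor(c("1.7", "2.25", "3.05", "1.7")),
  OLDF  = c("12.7", "4.6", "4.6", "12.7"),
  stringsAsFactors = FALSE
)

# Without assignment the result is only printed;
# the column in the data frame keeps its old type.
as.numeric(toy_df$OLD20)
class(toy_df$OLD20)   # still a factor

# Go through as.character() first: as.numeric() applied directly to a
# factor returns the internal level codes, not the printed values.
toy_df$OLD20 <- as.numeric(as.character(toy_df$OLD20))
toy_df$OLDF  <- as.numeric(as.character(toy_df$OLDF))

str(toy_df)   # inspect the column types after conversion
```

Running `str(combined_df)` before fitting would show whether all four predictors are actually numeric in the real data.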

Any help with fixing my code for the model would be really appreciated.
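For a 0/1 outcome like this, a logistic mixed model fitted with `glmer()` from lme4 (`family = binomial`) is a common alternative to `lmer()`. The following is only a sketch on simulated data, not a definitive specification for this study: `items_df`, `sim_df`, `M2`, and every simulated value and effect size are assumptions made up for illustration, while the formula mirrors the one in the question with crossed random intercepts for items and participants.

```r
library(lme4)

set.seed(1)  # simulated data only; none of these numbers come from the study

# Hypothetical item-level predictors: one row per word, all numeric
items_df <- data.frame(
  items             = factor(1:20),
  OLD20             = runif(20, 1.5, 3.5),
  OLDF              = runif(20, 4, 13),
  wordlength_letter = sample(4:10, 20, replace = TRUE),
  word_frequency    = rnorm(20)
)

# Fully crossed design: every participant spells every word (97 x 20 rows)
sim_df <- expand.grid(participants = factor(1:97), items = factor(1:20))
sim_df <- merge(sim_df, items_df, by = "items")

# Simulate 0/1 accuracy with participant and item random intercepts
p_int <- rnorm(97, sd = 1)
i_int <- rnorm(20, sd = 0.5)
eta <- 0.5 - 0.8 * (sim_df$OLD20 - 2.5) +
  p_int[as.integer(sim_df$participants)] +
  i_int[as.integer(sim_df$items)]
sim_df$spelling_accuracy <- rbinom(nrow(sim_df), size = 1, prob = plogis(eta))

# Standardise the predictors, which often helps glmer() converge
preds <- c("OLD20", "OLDF", "wordlength_letter", "word_frequency")
sim_df[preds] <- lapply(sim_df[preds], function(x) as.numeric(scale(x)))

# Logistic mixed model with crossed random intercepts
M2 <- glmer(
  spelling_accuracy ~ OLD20 + OLDF + wordlength_letter + word_frequency +
    (1 | items) + (1 | participants),
  data = sim_df,
  family = binomial
)
summary(M2)
```

With numeric predictors the fixed-effects table has one row per predictor plus the intercept, and the coefficients are on the log-odds scale.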
