Multilevel model using lme4

Can anyone help me with my r studio report ?

You haven't said enough to allow an informed response....

Also bear in mind that if it relates to a course of study you must adhere to the homework policy.

1 Like

Hello,
I am trying to carry out a multilevel model using lmer to look at how the effect of (birthweight) changes the outcome (cognitive function) time whilst controlling for variables.

I have created a null model, then compared this null model with an effect of birthweight (in order to try asses whether there is a change over time.

Then I have conducted a model comparison to isolate the effect of overall change over time.

when I did this it came up with an error (as shown bellow)

Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
unable to evaluate scaled gradient
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge: degenerate Hessian with 1 negative eigenvalues

Thanks for the background information.

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:

data set can be seen through this link: https://uoepsy.github.io/data/dapr3_2122_report1.csv.

The data set has information on participants from different hospitals who have had a stroke and is trying to asses whether their birthweight affects the participants change in cognitive function following the stoke. Cognitive function is assessed (straight after the stroke, 6,12,18 and 24 months following the stoke).

This is what iv done so far

library(tidyverse)
library(car)
library(Matrix)
library(lme4)
# Read in data set
data <- read_csv("https://uoepsy.github.io/data/dapr3_2122_report1.csv")
head(data)
# Remove N/A from data set 
newdata <- na.omit(data)

#LM of birthweight effect on ACER when control age + educ

lm <- lm(ACER ~ age + educ, data = newdata)
summary(lm)
# Cooks Distance
cooksd = cooks.distance(lm)
influential <- as.numeric(names(cooksd) [(cooksd > 4/694)])
cleandata <- newdata[-influential,]
# Null model
m.null <- lmer(ACER ~ 1 + (1 | pid), data=cleandata)
summary(m.null)

We can see the 0.35 / (0.35 + 6.54), of the total variance is attributable to participant-level variation.

#Now lets suppose we want to compare this null model with a model with an effect of birthweight(to assess whether there is overall change over time).

modA <- lmer(ACER ~ 1 + birthweight + (1 + birthweight | pid), data=cleandata)

modB <- lmer(ACER ~ 1 + birthweight + (1 | pid), data=cleandata)

isSingular(modA)
isSingular(modB)

#If we want to conduct a model comparison to isolate the effect of overall change over time (a fixed effect of birthweight), we might want to compare these two models:

m.base0 <- lmer(ACER ~ 1 + (1 + birthweight | pid), data=cleandata)

m.base <- lmer(ACER ~ 1 + birthweight + (1 + birthweight | pid), data=cleandata)

isSingular(m.base0)
isSingular(m.base)
# Straightforward LRT
anova(m.base0, m.base)
# parametric bootstrap LRT
library(pbkrtest)
PBmodcomp(m.base, m.base0)

I ran all your code as provided.
While some of your lmer model fits do give you singularity warnings, I dont see the error you listed.
If you rerun the code you have here in a fresh session, do you always get the error ?

inconsistent behaviour between two R users over the same code and data, often comes down to package version differences.
My package versions are:

 mpv <- Vectorize(packageVersion)
 mpv(c("lme4","tidyverse","Matrix","car","pbkrtest"))
$lme4
[1]  1  1 27  1

$tidyverse
[1] 1 3 1

$Matrix
[1] 1 3 4

$car
[1]  3  0 11

$pbkrtest
[1] 0 5 1
1 Like

Hi,
Yes I still get the error come up after the parametric bootstrap LRT.
Even when I copy in the package version you have the error still appears.

Also is there a way to control for age and education with the code that I have done?
I tried to do a new model to control for age an education:

mod_1 <- lmer(ACER ~ 1 + age + educ + birthweight (1 | hospital/pid), data = cleandata)

however it comes up with an error:
Error in birthweight(1 + pid) : could not find function "birthweight"

If I take out birthweight it says there's and error with educ and age, even though I have copied the names correctly from the data frame?

I dont understand this statement.
simply run my code that lists the versions of the packages that you have installed. and we can compare to see if your version numbers are equal to , or different than mine. if they are equal, then there is a mystery, if they are different, it will tell what should be adjusted.

you are missing some symbol between birthweight and the (1 | hospital/id)
perhaps a + symbol

This is what my packages say.

and thanks I have added a + and that model has worked now

that output only tells me what tidyverse package versions are.
This does not show lme4 etc.
If you would like additional support. please run the following code and share the output

 mpv <- Vectorize(packageVersion)
 mpv(c("lme4","tidyverse","Matrix","car","pbkrtest"))
 mpv <- Vectorize(packageVersion)
 mpv(c("lme4","tidyverse","Matrix","car","pbkrtest"))

the output:

$lme4
[1] 1 1 27 1

$tidyverse
[1] 1 3 1

$Matrix
[1] 1 3 4

$car
[1] 3 0 11

$pbkrtest
[1] 0 5 1

How can get rid of the warning:
boundary (singular) fit: see ?isSingular

m.base <- lmer(ACER ~ 1 + birthweight + (1 + birthweight | pid), data=cleandata)
isSingular(m.base)

I have tried to remove the random intercept by setting it to 0 however that has not made a difference.

m.base <- lmer(ACER ~ 1 + birthweight + (0 + birthweight | pid), data=cleandata)
is still comes up with:
isSingluar(m.base)- TRUE

maybe ?

ACER ~ birthweight + (1 | pid)

I am still getting the warning on the parametric bootstrap LRT do you think its down to my version- I am using it on the cloud because I had problems downloading it to my laptop?

We have same package Versions so I expect the same results between us if we use both the same data and the same code. That said its not clear to me exactly which 2 models you are trying to compate using parametric bootstrap LRT....

It at least works for me on the m.base and m.null


library(lme4)
library(pbkrtest)
library(readr)
# Read in data set
data <- read_csv("https://uoepsy.github.io/data/dapr3_2122_report1.csv")
head(data)

newdata <- na.omit(data)
#LM of birthweight effect on ACER when control age + educ

lm <- lm(ACER ~ age + educ, data = newdata)
summary(lm)
# Cooks Distance
cooksd = cooks.distance(lm)
influential <- as.numeric(names(cooksd) [(cooksd > 4/694)])
cleandata <- newdata[-influential,]
m.null <- lmer(ACER ~ 1 + (1 | pid), data=cleandata)

isSingular(m.null)
m.base <- lmer(ACER ~ birthweight + (1 | pid), data=cleandata)

isSingular(m.base)
# parametric bootstrap LRT

PBmodcomp(m.base, m.null)

results in

Bootstrap test; time: 19.81 sec; samples: 1000; extremes: 0;
large : ACER ~ birthweight + (1 | pid)
ACER ~ 1 + (1 | pid)
         stat df   p.value    
LRT    37.884  1 7.507e-10 ***
PBtest 37.884     0.000999 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Do you think the model that I have sent is a good way to asses the affect that birthweight has on the change in cognitive function ?

I don't know, I would assume that it would take some scientific knowledge in pediatric science to say with any confidence. I don't even know what 'ACER' is supposed to represent.

I wonder about the presence of a time variable in your dataset, which suggest this is some form of panel data, I don't think your analysis takes account of that, and I would have guessed that it perhaps should.

Also, what is 'pid' , if we are assuming its an ID then it wouldnt seem to be a unique one as it matches across hospital. Are the same individuals going to different hospitals, or does each hospital assign ID's sequentially starting from 1 ?
if you intended to model with pid assuming it were a unique ID then you would have to take the hospital into account, perhaps by combining them.

I think its very hard to do data science without a thorough understanding of the data ... so sorry If I can't tell you anything concrete

no thank you


The data is available at: https://uoepsy.github.io/data/dapr3_2122_report1.csv.

Study Aims
The data come from a study investigating factors that predict recovery in cognitive functioning following stroke. It has been hypothesised that changes in cognitive functioning following a stroke might be influenced by the birthweight of the stroke patient. The aim of the research is to assess this hypothesis. 12 hospitals took part in the study, recruiting a total of 204 patients who had suffered a stroke following up at 6 month intervals. At every assessment – baseline (immediately following stroke), 6 months-, 1 year-, 18 months-, and 2 years- post-stroke – researchers administered Addenbrooke’s Cognitive Examination-Revised (ACE-R) to examine changes in participants’ cognitive functioning following stroke. Data was also collected on two additional known risk factors for levels of cognitive functioning: age (measured at baseline) and years of education.

Data Dictionary

|hospital|- Hospital Identifier
|pid|- Participant Number
|timepoint|- Visit Number (1-5)
|ACER|- Score on Addenbrooke’s Cognitive Examination-Revised (ACE-R). Range 0-100, Scores below 82 indicate dementia
|months|- Months since stroke
|birthweight|- Birthweight (grams)
|age|- Age at baseline (years)
|educ|- Years of education

this is all the information I have what would you recommend to do?