Fitting lmer() model help

Hello,

I am working on a project investigating how certain variables impact the effectiveness of mental health interventions. I ran a RCT with two interventions. DV's were measured pre and post intervention and IV's were measured once pre intervention.
DV's: Anxiety, Positive Affect, Negative Affect
IV's: Previous positive experience with intervention, Attitude towards intervention, Expectancy, Credibility, and Trait anxiety
I am using a linear mixed effect model to test this.

anxiety_model <- lmer(Anxiety2 ~ Intervention + Positive_Experience + Credibility
                      + Expectancy + Attitude + Trait_Anxiety + Anxiety1 + (1|PPT), DATA)

positive_model <- lmer(Positive_Affect_2 ~ Intervention + Positive_Experience + Credibility
                      + Expectancy + Attitude + Trait_Anxiety + Positive_Affect_1 + (1|PPT), DATA)

negative_model <- lmer(Negative_Affect_2 ~ Intervention + Positive_Experience + Credibility
                      + Expectancy + Attitude + Trait_Anxiety + Negative_Affect_1 + (1|PPT), DATA)

Negative_Affect_2 is negative affect measured after the intervention (DV), Negative_Affect_1 is the negative affect measured before the intervention. The same applies for anxiety and positive affect.

I have 68 participants in intervention 1 and 53 in intervention 2. I have left NAs in as I have read that the lmer() can handle missing data.

Intervention is a factor variable with levels 1 and 2. PPT (participant number) is a character variable.

I have never ran this type of model before, I am just posting to seek some help with this model as I am unfamiliar with it and want to understand where I am going wrong.

Questions:

  1. are my models fitted correctly?
  2. I keep getting warning messages concerning eigenvalues, can any give me advice on these?
Warning message:
In as_lmerModLT(model, devfun) :
  Model may not have converged with 1 eigenvalue close to zero: -6.9e-09

Warning message:
Model failed to converge with 1 negative eigenvalue: -4.1e-08 

This is my first post and I am relatively new to R so I'm sorry if this is a confusing post. Please feel free to ask questions about my coding and I will answer them.

Thank you,
Karen :slight_smile:

1 Like

Hi Karen,

I'm not an expert in your field, but I have a lot of experience fitting lmer models in ecology. From what you've described, I believe your models are correctly specified.

The warning messages you are getting suggest your models are not fitting correctly -- they indicate an issue with the design matrix (the matrix of IVs for each participant) in the model. Typically you will get these warnings when one or more columns in the design matrix are very nearly identical to a multiple of another column. Another name for this problem is collinearity. For example, if Trait_Anxiety and Anxiety1 are highly correlated, then you would get this warning.

One way to check is to look at the correlation matrix for your IVs. Packages corrplot and corrr will help.
Variance Inflation Factors (VIF) can be more sensitive to problems like this, check function car::vif()

Alternatively, you can start with a simpler model like

anxiety_model <- lmer(Anxiety2 ~ Intervention + (1|PPT), DATA)

and see if the warnings go away. If so, then add your other IVs back in one at a time and see when the warnings turn up again. If the warnings don't go away with that model, then there is some deeper issue with the data structure.

I know you probably can't post the actual data because of confidentiality reasons, but if you're still stuck after trying the above, you might try creating a dataset that has the same properties as yours but just has X1, X2, ... etc for IVs. I've never tried it, but package synthpop might help you here, and there are a few other strategies if you google "anonymize data r". If you center and scale all the IVs, and change their names, they should still cause you problems but the scale is meaningless unless you know the scaling constants, which you wouldn't share. Then you could share the anonymized data and it would be easier to diagnose more difficult issues.

Good luck!

2 Likes

Hi Drew,

Thank you for your response.
I have now looked into the correlation matrix for my IV's and investigated the VIF. This returned nothing abnormal so I believe it must be a deeper issue with the data structure.

Is the data required to be in long format? It was in wide format before but I have changed it to long format now using the melt() function.

DATAlong<-melt(DATA, id.vars=c("PPT","Intervention"))

Now I have done that, the model is returning "Error in eval(predvars, data, env) : object'Anxiety2' not found.

I am currently working on anonymising the data.

Thanks again for your help!

The data structure should have one row per observation, and all of the variables should be present as columns -- so I think your data structure was fine.

Did you try the simpler model?

Yes, tried the simpler model

anxiety_model<- lmer(Anxiety2 ~ Intervention + (1|PPT), DATA

and this warning message appeared...

Warning message:
Model failed to converge with 1 negative eigenvalue: -2.0e-07

Can you post the summary of that model? might be something useful in there.

It would work better if you attach it as a CSV file. Also, something is a bit off because the PPT column has only two distinct values.

By summary of the model I meant
summary(anxiety_model)

Hi Drew,

Sorry, this is probably really obvious but I do not know how to attach a .csv file as the only options I can see available are photo images or pdf's.

Please see summary of anxiety model below

> anxiety_model<- lmer(Anxiety2 ~ Intervention + Positive_Experience  +  Credibility
+                      + Expectancy + Attitude + Trait_Anxiety + Anxiety1 + (1|PPT) , DATA)
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge with max|grad| = 0.0425793 (tol = 0.002, component 1)
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?

> summary(anxiety_model)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: Anxiety2 ~ Intervention + Positive_Experience + Credibility +      Expectancy + Attitude + Trait_Anxiety + Anxiety1 + (1 | PPT)
   Data: DATA

REML criterion at convergence: 459.7

Scaled residuals: 
       Min         1Q     Median         3Q        Max 
-4.001e-06 -7.951e-07 -7.110e-08  8.578e-07  3.864e-06 

Random effects:
 Groups   Name        Variance  Std.Dev. 
 PPT      (Intercept) 1.636e+01 4.044e+00
 Residual             3.302e-11 5.746e-06
Number of obs: 98, groups:  PPT, 94

Fixed effects:
                      Estimate Std. Error         df t value Pr(>|t|)
(Intercept)         17.2989161  8.2550947  0.0054003   2.096    0.978
Intervention        -1.6009057  1.6031778  0.0458100  -0.999    0.903
Positive_Experience -0.0034879  0.0637182  0.0066724  -0.055    0.996
Credibility         -0.1336464  0.1208873  0.0560781  -1.106    0.883
Expectancy           0.0154761  0.0125173  0.0160547   1.236    0.953
Attitude             0.0782994  0.0727864  0.0005378   1.076    0.998
Trait_Anxiety        0.0219922  0.0974992  0.0847401   0.226    0.943
Anxiety1             0.5589693  0.0997523  0.0022963   5.604    0.988

Correlation of Fixed Effects:
            (Intr) Intrvn Pstv_E Crdblt Expctn Attitd Trt_An
Interventin -0.640                                          
Pstv_Exprnc -0.533  0.734                                   
Credibility -0.091  0.255 -0.090                            
Expectancy   0.332 -0.215 -0.045 -0.438                     
Attitude    -0.406  0.058 -0.077 -0.388 -0.224              
Trait_Anxty -0.493  0.070  0.061  0.011 -0.195  0.142       
Anxiety1    -0.353  0.072  0.161  0.039 -0.106 -0.123 -0.307
convergence code: 0
Model failed to converge with max|grad| = 0.0425793 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?

P.S the variable PPT has been sorted, the two distinct values were an accident I created by playing around with the formating

If you have a small anonymised data frame you wish to share. You can use the base function

dput()

This will cause a copy and pastable representation of your data frame that you can share as text

Thank you, however my dataset is 121 obs of 13 variables so too large to copy and paste as text

4 posts were split to a new topic: Can I share a dataset (csv, rdata, etc) on RStudio Community by renaming the filetype?

Hi @KarenMillin, 0.04 is quite off the targeted gradient of 0.002.
So indeed, the estimates may not be trustworthy.

Risking to state the obvious but it is not clear from your post: did you try to do what the warning says, i.e. rescaling the variables?

That may greatly help. You can use scale() for that, but then you must interpret each coefficient as the change triggered by an increase in one standard deviation of the focal covariate.

Hope this help.

Hi Alexandre,

Thank you for your response.
I have scaled but I have possibly not done it correctly?
I used the scale function on the numeric values in the dataset (this excluded PPT and Intervention)

ind<-sapply(DATA, is.numeric)
DATA[ind] <- lapply(DATA[ind], scale)

and similar warnings come up with or without the scales variables

Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge with max|grad| = 0.050911 (tol = 0.002, component 1)
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?
3: In as_lmerModLT(model, devfun) :
  Model may not have converged with 1 eigenvalue close to zero: 5.7e-11

I have also used this coding to try scaling incase it was my coding which was the problem

NumDATA<-DATA[,c(3:13)]
scaled<-scale(NumDATA)
PPT_and_Intervention<-DATA[,c(1,2)]
DATASCALED<-cbind(PPT_and_Intervention,scaled)

and a similar warning comes up

Warning messages:
1: In optwrap(optimizer, devfun, getStart(start, rho$lower, rho$pp),  :
  convergence code -4 from nloptwrap
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  unable to evaluate scaled gradient
3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge: degenerate  Hessian with 1 negative eigenvalues
4: In as_lmerModLT(model, devfun) :
  Model may not have converged with 1 eigenvalue close to zero: 4.8e-09

What you have done for scaling the variables seem OK but you should take the habit to check.
Here the check would consist in testing that means are 0 and sd 1.
To do the whole thing, I would perhaps use {dplyr}, particularly if you are unsure of your *apply() calls.
Here is an example using the iris datset:

library(dplyr)

# scalling:
iris %>%
  mutate_if(is.numeric, scale) -> scaled_iris

# check if all means are 0
scaled_iris %>%
  summarise_if(is.numeric, ~ near(mean(.x), y = 0))
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1         TRUE        TRUE         TRUE        TRUE

# check if all sd are 1
scaled_iris %>%
  summarise_if(is.numeric, ~ near(sd(.x), y = 1))
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 1         TRUE        TRUE         TRUE        TRUE

but indeed this does not seem to solve your problem...
Could you show the model summary of with the scaled variables?

Ah this might be where things are going wrong...

> DATASCALED %>%
+   summarise_if(is.numeric, ~ near(mean(.x), y = 0))
  Intervention Positive_Experience Credibility Expectancy Attitude Trait_Anxiety Anxiety1 Positive_Affect_1
1        FALSE                  NA          NA         NA       NA          TRUE     TRUE              TRUE
  Negative_Affect_1 Anxiety2 Positive_Affect_2 Negative_Affect_2
1              TRUE       NA                NA                NA
> DATASCALED %>%
+   summarise_if(is.numeric, ~ near(sd(.x), y = 1))
  Intervention Positive_Experience Credibility Expectancy Attitude Trait_Anxiety Anxiety1 Positive_Affect_1
1        FALSE                  NA          NA         NA       NA          TRUE     TRUE              TRUE
  Negative_Affect_1 Anxiety2 Positive_Affect_2 Negative_Affect_2
1              TRUE       NA                NA                NA

The missing means are likely due to missing values in the data. The mean function will return NA if at least one value is missing, unless the na.rm argument of mean is set to TRUE (it is FALSE by default). Looking at the summary of your data you posted earlier, the NA values above occurred for those columns with at least one missing value. Try changing mean(.x) to mean(.x, na.rm=TRUE) in your code.

Also, I'm not sure how Intervention is appearing at all in the output if it's coded as a factor. It looks like it must be numeric if it's passing the is.numeric test in summarise_if. I'm also not clear why Intervention isn't being scaled properly by the scale function. For example, if I do x=rep(1:2, c(68,53)) and then mean(x); sd(x) I get the expected result.

Regarding the convergence warnings, these troubleshooting tips might be helpful.

It would definitely help to have a sample of your real data. Could you paste into a reply the output of, say, dput(DATA[1:10, ]) and dput(DATASCALED[1:10, ]).

2 Likes
> DATASCALED %>%
+   summarise_if(is.numeric, ~ near(mean(.x, na.rm=T), y = 0))
  Positive_Experience Credibility Expectancy Attitude Trait_Anxiety Anxiety1 Positive_Affect_1
1                TRUE        TRUE       TRUE     TRUE          TRUE     TRUE              TRUE
  Negative_Affect_1 Anxiety2 Positive_Affect_2 Negative_Affect_2
1              TRUE     TRUE              TRUE              TRUE
> DATASCALED %>%
+   summarise_if(is.numeric, ~ near(sd(.x, na.rm = T), y = 1))
  Positive_Experience Credibility Expectancy Attitude Trait_Anxiety Anxiety1 Positive_Affect_1
1                TRUE        TRUE       TRUE     TRUE          TRUE     TRUE              TRUE
  Negative_Affect_1 Anxiety2 Positive_Affect_2 Negative_Affect_2
1              TRUE     TRUE              TRUE              TRUE

Thank you so far! This is the data scaled correctly using na.rm=T

Hi Joel,
Here is a sample, I have excluded PPT for confidentiality

> dput(DATA[1:10,-1 ])
structure(list(Intervention = structure(c(2L, 2L, 1L, 1L, 2L, 
1L, 1L, 2L, 1L, 2L), .Label = c("1", "2"), class = "factor"), 
    Positive_Experience = c(28L, 25L, 48L, 44L, 24L, 56L, 44L, 
    27L, 54L, 24L), Credibility = c(14L, 15L, 28L, 18L, 19L, 
    26L, 27L, 17L, 19L, 10L), Expectancy = c(67L, 130L, 148L, 
    62L, 41L, 148L, 126L, 92L, 116L, 47L), Attitude = c(58L, 
    52L, 59L, 57L, 62L, 69L, NA, 68L, NA, NA), Trait_Anxiety = c(50L, 
    49L, 48L, 55L, 47L, 51L, 34L, 38L, 43L, 37L), Anxiety1 = c(46L, 
    45L, 40L, 43L, 39L, 41L, 28L, 45L, 39L, 40L), Positive_Affect_1 = c(20L, 
    27L, 24L, 29L, 15L, 36L, 13L, 26L, 29L, 15L), Negative_Affect_1 = c(13L, 
    14L, 23L, 20L, 30L, 13L, 15L, 23L, 10L, 12L), Anxiety2 = c(40L, 
    46L, 36L, 51L, 40L, 42L, 33L, 47L, 40L, 37L), Positive_Affect_2 = c(19L, 
    30L, 26L, 33L, 14L, 42L, 20L, 28L, 30L, 19L), Negative_Affect_2 = c(11L, 
    11L, 12L, 18L, 28L, 10L, 10L, 17L, 10L, 10L)), row.names = c(NA, 
10L), class = "data.frame")
> dput(DATASCALED[1:10,-1 ])
structure(list(Intervention = structure(c(2L, 2L, 1L, 1L, 2L, 
1L, 1L, 2L, 1L, 2L), .Label = c("1", "2"), class = "factor"), 
    Positive_Experience = c(-0.701604116517139, -0.939112124832279, 
    0.881782605583798, 0.56510526116361, -1.01828146093733, 1.51513729442417, 
    0.56510526116361, -0.780773452622186, 1.35679862221408, -1.01828146093733
    ), Credibility = c(-1.16909004675983, -1.00167858421666, 
    1.17467042884461, -0.499444196587137, -0.332032734043962, 
    0.839847503758257, 1.00725896630143, -0.666855659130311, 
    -0.332032734043962, -1.83873589693253), Expectancy = c(-0.926956916586304, 
    0.424855253435389, 0.811087302013016, -1.03424359674676, 
    -1.48484765342065, 0.811087302013016, 0.339025909307028, 
    -0.390523515784045, 0.124452548986124, -1.35610363722811), 
    Attitude = c(-0.601886922263556, -1.32415122897982, -0.481509537810845, 
    -0.722264306716268, -0.120377384452711, 0.722264306716268, 
    NA, 0.601886922263556, NA, NA), Trait_Anxiety = c(0.40727588473911, 
    0.217004139359062, 0.0267323939790145, 1.35863461163935, 
    -0.163539351401033, 0.597547630119158, -2.63707204134166, 
    -1.87598505982146, -0.924626332921225, -2.06625680520151), 
    Anxiety1 = c(0.360164280390108, 0.148611474918249, -0.909152552441049, 
    -0.27449413602547, -1.12070535791291, -0.697599746969189, 
    -3.44778621810336, 0.148611474918249, -1.12070535791291, 
    -0.909152552441049), Positive_Affect_1 = c(-1.23988545925122, 
    -0.22226628872236, -0.658388790377586, 0.0684820457144573, 
    -1.96675629534327, 1.08610121624332, -2.25750462978008, -0.367640455940769, 
    0.0684820457144573, -1.96675629534327), Negative_Affect_1 = c(-0.69514055812505, 
    -0.525559897775995, 1.00066604536549, 0.491924064318332, 
    2.18773066780888, -0.69514055812505, -0.355979237426941, 
    1.00066604536549, -1.20388253917221, -0.864721218474104), 
    Anxiety2 = c(-0.896585529996111, 0.32415015315244, -1.71040931876181, 
    1.34142988910957, -0.896585529996111, -0.48967363561326, 
    -2.32077716033609, 0.527606100343866, -0.896585529996111, 
    -1.50695337157039), Positive_Affect_2 = c(-1.4696357765789, 
    -0.0873261838257026, -0.589984217554138, 0.289667341470624, 
    -2.09795831873944, 1.4206479173596, -1.34397126814679, -0.33865520068992, 
    -0.0873261838257026, -1.4696357765789), Negative_Affect_2 = c(-0.575431702367056, 
    -0.575431702367056, -0.387860042479452, 0.737569916846171, 
    2.61328651572221, -0.763003362254659, -0.763003362254659, 
    0.549998256958567, -0.763003362254659, -0.763003362254659
    )), row.names = c(NA, 10L), class = "data.frame")
> 

Don't forget to do this please:

Sorry Alexandre, here it is now

> summary(anxiety_model)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: Anxiety2 ~ Intervention + Positive_Experience + Credibility +  
    Expectancy + Attitude + Trait_Anxiety + Anxiety1 + (1 | PPT)
   Data: DATASCALED

REML criterion at convergence: 131.9

Scaled residuals: 
       Min         1Q     Median         3Q        Max 
-6.345e-07 -1.271e-07 -7.250e-09  1.407e-07  6.334e-07 

Random effects:
 Groups   Name        Variance  Std.Dev. 
 PPT      (Intercept) 6.772e-01 8.229e-01
 Residual             3.664e-14 1.914e-07
Number of obs: 98, groups:  PPT, 94

Fixed effects:
                     Estimate Std. Error        df t value Pr(>|t|)
(Intercept)          0.141053   0.154859  0.101408   0.911    0.839
Intervention2       -0.279668   0.310413  0.101408  -0.901    0.839
Positive_Experience  0.004366   0.159561  0.101408   0.027    0.992
Credibility         -0.157852   0.146184  0.101408  -1.080    0.825
Expectancy           0.135684   0.117723  0.101408   1.153    0.820
Attitude             0.147302   0.121889  0.101408   1.208    0.816
Trait_Anxiety        0.026961   0.098053  0.101408   0.275    0.927
Anxiety1             0.527898   0.094465  0.101408   5.588    0.700

Correlation of Fixed Effects:
            (Intr) Intrv2 Pstv_E Crdblt Expctn Attitd Trt_An
Interventn2 -0.840                                          
Pstv_Exprnc -0.618  0.717                                   
Credibility -0.213  0.257 -0.094                            
Expectancy   0.156 -0.183 -0.010 -0.435                     
Attitude     0.014  0.007 -0.133 -0.398 -0.209              
Trait_Anxty -0.059  0.009  0.003  0.010 -0.161  0.102       
Anxiety1    -0.050  0.067  0.163  0.058 -0.085 -0.184 -0.362
convergence code: 0
Model failed to converge with max|grad| = 0.0247718 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?