Mediation with MICE multiple imputation

I am conducting a mediation analysis in R with incomplete data for my master's thesis. For the missing values I use MICE multiple imputation according to van Buuren & Groothuis-Oudshoorn (2011).

Tingley et al. (2014) wrote about mediation and missing data:

"The mediation package includes a pair of utility functions – mediations and amelidiate – to facilitate such analysis. First, users simulate multiple data sets using their preferred imputation software. Next, run mediate on each data set by simply passing the data sets through mediations. Next, pass the output of mediations to the amelidiate function, which combines the components of the output from mediations into a format that can be analyzed with the standard summary and plot commands."
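A minimal sketch of the workflow described in this quote might look as follows. It assumes the mice and mediation packages are installed, uses simulated stand-in data with the three variable names from this post, and relies on my reading of the `mediations()` documentation (a named list of data sets, with the treatment name repeated once per data set). Treat it as a starting point under those assumptions, not as a verified solution.

```r
# Sketch only: simulated stand-in data, not the thesis data.
library(mice)
library(mediation)

set.seed(1234)
n  <- 200
df <- data.frame(ModuleEr = rpois(n, 4))
df$WAI_P   <- 3 + 0.3 * df$ModuleEr + rnorm(n)
df$QIDS_t1 <- 15 - 0.5 * df$WAI_P - 0.2 * df$ModuleEr + rnorm(n)
df$WAI_P[sample(n, 30)]   <- NA   # introduce missing values
df$QIDS_t1[sample(n, 30)] <- NA

# Step 1: create the multiply imputed data sets
imp <- mice(df, m = 5, seed = 1234, printFlag = FALSE)

# complete(..., action = "all") returns a list of the m completed data sets
datasets <- mice::complete(imp, action = "all")
# mediations() looks data sets up by name, so give them syntactic names
names(datasets) <- paste0("D", seq_along(datasets))

# Step 2: run mediate() on every imputed data set via mediations()
med_list <- mediations(
  datasets,
  treatment = rep("ModuleEr", length(datasets)),  # repeated once per data set
  mediators = "WAI_P",
  outcome   = "QIDS_t1",
  families  = c("gaussian", "gaussian"),
  sims      = 100
)

# Step 3: combine the per-data-set results and summarise
combined <- amelidiate(med_list)
summary(combined)
```

Note that `mediations()` fits the mediator and outcome models itself on each completed data set, so no pooled `with()`/`pool()` objects are passed to it. With a continuous predictor such as ModuleEr, `mediate()` by default contrasts `control.value = 0` with `treat.value = 1`.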

Unfortunately, I was not able to write working R code. I have the following three variables:

Predictor: number of completed modules (ModuleEr)

Mediator: therapeutic alliance with eCoach (WAI_P)

Outcome: Severity of depression (QIDS_t1)

Suppose I do a simple imputation; how do I then proceed with the mediation?

I was thinking of something like this:

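library(mice)
library(mediation)

df <- data

# Create 5 imputed data sets
imp <- mice(df, m = 5, seed = 1234)

# Mediator model, fitted on each imputed data set and then pooled
mod1 <- with(data = imp, expr = lm(WAI_P ~ ModuleEr))
res1 <- pool(mod1)

# Outcome model, fitted on each imputed data set and then pooled
mod2 <- with(data = imp, expr = lm(QIDS_t1 ~ ModuleEr + WAI_P))
res2 <- pool(mod2)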


For mediation I was thinking of:

a) Med <- mediate(mod1, mod2, treat = "QIDS_t1", mediator = "WAI_P", sims = 100)

Or b) Med <- with(data = imp, expr = mediate(mod1, mod2, treat = "QIDS_t1", mediator = "WAI_P", sims = 100))

But both a) and b) produced the following error:

Error in formula.default(object, env = baseenv()) : invalid formula

I would be very happy if someone could share an R code with me or correct mine!

Thanks a lot!


If you would like us to take a deeper look at the issue, we need to be able to run your code. At this point, there is no data we can work with. I suggest you create a reprex, because this will help us a lot. A reprex consists of the minimal code and data needed to recreate the issue or question you're having. You can find instructions on how to build and share one here:

If your data are private, large, or sensitive, you should create a small data frame that generates the same error, even though the results themselves could be complete nonsense (it's the error we care about at the moment).
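As a rough illustration of what such a stand-in data frame could look like: the variable names below come from your post, while the values, the sample size, and the name `toy` are made up for this example.

```r
# Hypothetical stand-in data: same variable names as in the question,
# but simulated values, so nothing private is shared.
set.seed(42)
n <- 50
toy <- data.frame(ModuleEr = sample(0:8, n, replace = TRUE))
toy$WAI_P   <- round(3 + 0.2 * toy$ModuleEr + rnorm(n), 1)
toy$QIDS_t1 <- round(14 - 0.8 * toy$WAI_P + rnorm(n, sd = 2))

# Knock out some values so the imputation step is still needed.
toy$WAI_P[sample(n, 8)]   <- NA
toy$QIDS_t1[sample(n, 8)] <- NA

# dput() prints R code that recreates the data frame;
# paste its output into the post together with the failing code.
dput(toy)
```

Running your imputation and mediation code on a data frame like this should be enough to show whether the same error appears.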

Hope this helps,
