Improving reproducibility for Rmarkdown-based manuscript

I've tagged this as RMarkdown, but it's more of a general question...

If I want to make a reproducible manuscript (written in RMarkdown) within an R project easy for another researcher to re-render (using the manuscript's data and code), what would be the pros and cons of the following two approaches to package management:

  • use renv::snapshot() to include a lock file in the project (uploaded to OSF) and link to the renv vignettes in a readme file.
  • use a DESCRIPTION file and let users know they can install the needed packages with devtools::install_deps() in a readme file.

For context, I expect that most readers would download a .zip file from OSF if they wanted to play with the code and data.

I've had some challenges with renv in the past, and I'm concerned it may create issues for potential users of the code/data who may not know anything about renv, so I was considering that the DESCRIPTION file approach might create fewer issues. Would including the renv snapshot create issues for users who don't want to use renv?

thanks!

I'll try to give generic solutions to this problem, but this will usually come down to personal preference. I believe there are also tools in the R ecosystem other than renv that help with this.

First, the main difference between the DESCRIPTION approach and an renv lockfile is that the former records only the project's direct package dependencies, without pinning their versions, while the latter records every package in the dependency tree together with the version installed at snapshot time. The lockfile therefore ensures better reproducibility in the future, since packages are sometimes archived, or simply evolve and break a past analysis.
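To make the DESCRIPTION side concrete, here is a minimal sketch in base R that writes such a file and reads it back. The helper name `write_description`, the placeholder package name "manuscript", and the listed dependencies are all made up for illustration; it writes to a temporary directory so it is safe to run as-is.

```r
# Write a minimal DESCRIPTION listing only the direct dependencies.
# `write_description` is a hypothetical helper, not part of any package.
write_description <- function(path, imports) {
  fields <- data.frame(
    Package = "manuscript",  # placeholder; the project is not a real package
    Title   = "Code and Data for the Manuscript",
    Version = "0.0.1",
    Imports = paste(imports, collapse = ", "),
    stringsAsFactors = FALSE
  )
  write.dcf(fields, file = path)
  invisible(path)
}

# In a real project the path would simply be "DESCRIPTION" in the project root.
desc_path <- file.path(tempdir(), "DESCRIPTION")

# Illustrative dependencies; a minimum version can be given, but not an exact pin.
write_description(desc_path, c("rmarkdown", "knitr", "dplyr (>= 1.0.0)"))

# Read the field back to see what a collaborator's tooling would pick up.
read.dcf(desc_path, fields = "Imports")

# A collaborator would then run, from the project root:
# devtools::install_deps()
```

Note that nothing in this file says which version of rmarkdown or knitr was actually used, which is the gap the lockfile fills.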

However, renv works best in a project structure. How to collaborate in that case is described in https://pkgs.rstudio.com/renv/articles/collaborating.html. Ideally you would share the Rmd as a project with a .Rprofile and an renv.lock file. The .Rprofile should take care of installing renv and setting up the project library for the user, so that running renv::restore() is all it takes to reinstall everything needed. So there is not much to know about renv in this case.
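As a rough sketch of that workflow: the helpers `missing_renv_files` and `restore_project` below are hypothetical names of my own, and the file list follows the collaboration guide linked above, which also asks for renv/activate.R (the script the .Rprofile sources). Only `renv::init()`, `renv::snapshot()` and `renv::restore()` are actual renv functions.

```r
# Files renv's collaboration guide says to share alongside the code and data.
renv_files <- c("renv.lock", ".Rprofile", file.path("renv", "activate.R"))

# Return the renv files that are absent from `project` (hypothetical helper).
missing_renv_files <- function(project = ".") {
  renv_files[!file.exists(file.path(project, renv_files))]
}

# Restore the recorded package versions, failing early with a readable
# message if the download is incomplete (hypothetical helper).
restore_project <- function(project = ".") {
  missing <- missing_renv_files(project)
  if (length(missing) > 0) {
    stop("Missing renv files: ", paste(missing, collapse = ", "))
  }
  renv::restore(project = project, prompt = FALSE)
}

# Author, before zipping the project for OSF:
#   renv::init()          # once, to set up the project library
#   renv::snapshot()      # whenever the packages in use change
#   missing_renv_files()  # should return character(0)
#
# Reader, after unzipping and opening the project:
#   restore_project()
```

For most readers the plain `renv::restore()` call is enough; the check is only there to give a clearer error when a file did not make it into the .zip.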

Another solution is renv::use(), which is quite recent (see the "Using Use" article in the renv documentation). It lets you record dependencies in an R script that one could execute to reinstall them. I believe it could even work from a code chunk in an Rmd document.
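A small sketch of what that could look like. renv::use() takes "package@version" strings (or a `lockfile` argument); the helper `pin_specs` and the versions shown are only illustrative, not taken from any real manuscript.

```r
# Build "package@version" strings, the format renv::use() accepts.
# `pin_specs` is a hypothetical helper for illustration.
pin_specs <- function(pins) paste0(names(pins), "@", unname(pins))

pins <- c(dplyr = "1.1.4", ggplot2 = "3.5.1")  # illustrative versions only
pin_specs(pins)

# At the top of a script, or in the first chunk of the Rmd, one would write:
#   renv::use("dplyr@1.1.4", "ggplot2@3.5.1")
# or point at an existing lockfile instead:
#   renv::use(lockfile = "renv.lock")
```

The appeal is that the pinned versions travel inside the script itself, so a reader who only grabs the .Rmd still gets them.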

Some other tools I know of can be useful too, like capsule (GitHub - MilesMcBain/capsule: An inversion of renv for low effort reproducible R package libraries), which tries to offer a simplified way of using renv for collaboration. It is built on renv, but there are only a few functions you need to know about to use capsule: capsule::create(), capsule::run(), capsule::recreate().

Hope it helps

Wow thanks so much. This is really helpful.

I settled on uploading the full R project with the renv.lock file and including basic instructions for renv in the project readme.

I think you're right that this is a pretty reasonable approach.

Thanks for taking the time to respond
