Packrat bundles with different R versions - suggestions and alternatives

Hi all,

I am looking into packrat as a way of keeping things from breaking when updating / installing packages for a particular project. A typical situation is that a new package needs to be installed but requires some dependencies to be upgraded / installed, which then breaks existing code. This is particularly common with Bioconductor packages. It is also an issue for us because quite a few of our projects take years to complete, sometimes spanning several R / BioC versions.

What did I actually do? Bundle creation:

  • On my local computer (OSX, fresh install), installed R 3.6 and multiple R and BioC packages
  • Started a new project with packrat::init()
  • Created a snapshot with packrat::snapshot()
  • And finally a bundle with packrat::bundle(), which I think includes all package sources as well.
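
For reference, the bundle-creation steps above can be sketched as a single function. This is only a minimal sketch, assuming packrat is installed; the names create_bundle and project_dir are hypothetical and not part of packrat.

```r
# Sketch of the bundle-creation steps; assumes the packrat package is installed.
# `create_bundle` and `project_dir` are hypothetical names used for illustration.
create_bundle <- function(project_dir = ".") {
  # Set up a private, per-project package library
  packrat::init(project = project_dir)
  # Record the package versions currently in use in the lockfile
  packrat::snapshot(project = project_dir)
  # Package the project into a tarball; include.src = TRUE (the default)
  # also ships the package sources
  packrat::bundle(project = project_dir, include.src = TRUE)
}
```

Nothing runs until create_bundle() is called inside a project directory.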

Project restoration:

  • On a server (R version 3.5.1, CentOS Linux 7), I unbundled the bundle and got the following error:
> packrat::unbundle("test_packrat-2019-06-11.tar.gz", "./")
- Untarring 'test_packrat-2019-06-11.tar.gz' in directory '/lustre/projects/bioinfo/domingue/scripts'...
- Restoring project library...
Installing BH (1.69.0-1) ... 
        OK (built source)
Installing BiocGenerics (0.30.0) ... 
Error: Command failed (1)

Failed to run system command:

        '/sw/apps/r/3.5.1/lib64/R/bin/R' --vanilla CMD INSTALL '/tmp/RtmpvUnPCU/BiocGenerics' --library='/lustre/projects/bioinfo/domingue/scripts/test_packrat/packrat/lib/x86_64-pc-linux-gnu/3.5.1' --install-tests --no-docs --no-multiarch --no-demo 

The command failed with output:
ERROR: this R is version 3.5.1, package 'BiocGenerics' requires R >=  3.6.0
In addition: Warning message:
In restore(project = getwd(), restart = FALSE) :
  The most recent snapshot was generated using R version 3.6.0

So as far as I can tell, and somewhat unsurprisingly, I am running into the problem that R / BioC packages snapshotted under one R version (3.6.0) will not install under a different one (3.5.1).
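
A mismatch like this could be caught before attempting a restore by comparing the R version recorded in the lockfile with the running R. The following is a rough sketch in base R, assuming the packrat lockfile (packrat/packrat.lock) is DCF-formatted with an RVersion field in its header record; lock_r_version and same_minor_series are hypothetical helper names.

```r
# Hypothetical helpers; assume a DCF-formatted lockfile with an `RVersion` field.

# Return the R version string recorded in the lockfile header.
lock_r_version <- function(lockfile) {
  fields <- read.dcf(lockfile, fields = "RVersion")
  versions <- fields[!is.na(fields[, "RVersion"]), "RVersion"]
  if (length(versions) == 0) stop("no RVersion field found in ", lockfile)
  versions[[1]]
}

# Reduce a version such as "3.6.0" to its "major.minor" series, e.g. "3.6".
major_minor <- function(v) {
  parts <- strsplit(as.character(v), ".", fixed = TRUE)[[1]]
  paste(parts[1:2], collapse = ".")
}

# TRUE when both versions belong to the same major.minor series.
same_minor_series <- function(recorded, running = getRversion()) {
  identical(major_minor(recorded), major_minor(running))
}
```

Comparing only the major.minor series is a design choice made here because each Bioconductor release targets one R series; a stricter or looser check may suit other projects.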

The questions are now:

  1. How to overcome this issue short of creating a docker container for every project?
  2. Is there a better way of keeping R workflows from breaking when R or its packages are updated? I am thinking something like virtualenvs in Python would work (Python + packages for each project), but I am not sure something like this exists in R.

Regarding 2, I am guessing snapshots will tell me which versions the packages were at before updating, but I don't know if it would be easy to restore them after an R update.
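
On the point about snapshots recording versions: the recorded versions can be read straight out of the lockfile. Below is a small sketch in base R, assuming each package record in packrat/packrat.lock carries Package and Version fields; locked_versions is a hypothetical helper name.

```r
# Hypothetical helper: list the package versions recorded in a packrat lockfile.
# Assumes a DCF-formatted file whose package records have `Package` and `Version`.
locked_versions <- function(lockfile) {
  records <- read.dcf(lockfile, fields = c("Package", "Version"))
  # Drop the header record, which has no `Package` field
  records <- records[!is.na(records[, "Package"]), , drop = FALSE]
  data.frame(
    package = records[, "Package"],
    version = records[, "Version"],
    stringsAsFactors = FALSE
  )
}
```

The resulting data frame could then be compared against installed.packages() to see what changed after an update.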

Cheers!

You might be interested in renv, the next-generation successor to packrat.

The hope is that it will handle cases like this better. If you test it in your situation, your experience will be valuable.

renv::migrate was just added to help convert an existing packrat project into an renv project.

All this is still in development, but early feedback is always very useful!
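
As a rough illustration of what trying this out might look like, here is a minimal sketch, assuming renv is installed and the project already uses packrat. The wrapper names migrate_project and restore_project and the argument project_dir are hypothetical; since renv is still in development, the behaviour of these calls may change.

```r
# Sketch only; assumes the renv package is installed.
# `migrate_project`, `restore_project` and `project_dir` are hypothetical names.

# On the original machine: convert the packrat project to an renv project.
migrate_project <- function(project_dir = ".") {
  renv::migrate(project = project_dir)
}

# On the target machine: reinstall the packages recorded in the lockfile.
restore_project <- function(project_dir = ".") {
  renv::restore(project = project_dir)
}
```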

Thank you for the tip. I had heard of renv but somehow couldn't remember the name.

As for its suitability, it doesn't seem to be an improvement over packrat, at least for my use case. After installing it with devtools::install_github("rstudio/renv"), I ran renv::migrate() as suggested, copied the lockfile to another computer, ran renv::restore(), and got the following error:

* Querying repositories for available source packages ... Done!
Retrieving 'https://bioconductor.org/packages/3.8/bioc/src/contrib/Archive/AnnotationDbi/AnnotationDbi_1.46.0.tar.gz' ...
curl: (22) The requested URL returned error: 404 Not Found
curl: (22) The requested URL returned error: 404 Not Found
Retrieving 'https://bioconductor.org/packages/3.8/data/annotation/src/contrib/Archive/AnnotationDbi/AnnotationDbi_1.46.0.tar.gz' ...
curl: (22) The requested URL returned error: 404 Not Found
curl: (22) The requested URL returned error: 404 Not Found
Retrieving 'https://bioconductor.org/packages/3.8/data/experiment/src/contrib/Archive/AnnotationDbi/AnnotationDbi_1.46.0.tar.gz' ...
curl: (22) The requested URL returned error: 404 Not Found
curl: (22) The requested URL returned error: 404 Not Found
Retrieving 'https://bioconductor.org/packages/3.8/workflows/src/contrib/Archive/AnnotationDbi/AnnotationDbi_1.46.0.tar.gz' ...
curl: (22) The requested URL returned error: 404 Not Found
curl: (22) The requested URL returned error: 404 Not Found
Retrieving 'https://cran.rstudio.com/src/contrib/Archive/AnnotationDbi/AnnotationDbi_1.46.0.tar.gz' ...
curl: (22) The requested URL returned error: 404 Not Found
curl: (22) The requested URL returned error: 404 Not Found
Retrieving 'https://cran.rstudio.com//src/contrib/Archive/AnnotationDbi/AnnotationDbi_1.46.0.tar.gz' ...
curl: (22) The requested URL returned error: 404 Not Found
curl: (22) The requested URL returned error: 404 Not Found
Error: failed to retrieve package 'AnnotationDbi' from CRAN
Traceback (most recent calls first):
  9: stop(sprintf(fmt, ...), call. = call.)
  8: stopf("failed to retrieve package '%s' from CRAN", record$Package)
  7: renv_retrieve_cran(record)
  6: renv_retrieve_bioconductor(record)
  5: renv_retrieve_impl(package)
  4: handler(package, renv_retrieve_impl(package))
  3: renv_retrieve(packages)
  2: renv_restore_run_actions(project, diff, current, lockfile)
  1: renv::restore()

Again, the issue is mainly with Bioconductor packages. Reading the GH issues, it should have failed the installation gracefully and moved on to the next package, but that was not the case. I have no idea why it was looking for the package sources at that location either.
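
One thing that stands out in the log is that every Bioconductor URL points at the 3.8 repositories, while the earlier packrat log shows the snapshot's BiocGenerics 0.30.0 requiring R >= 3.6.0. A possible explanation (an assumption, not confirmed anywhere in this thread) is that the restore used the Bioconductor release matching the local R rather than the one the snapshot was taken with. The sketch below uses base R to pull the release out of such a URL so it can be compared with the expected one; bioc_release_from_url is a hypothetical helper name.

```r
# Hypothetical helper: extract the Bioconductor release from a repository URL.
# Returns NA for URLs that are not Bioconductor package URLs.
bioc_release_from_url <- function(url) {
  pattern <- "bioconductor\\.org/packages/([0-9]+\\.[0-9]+)/"
  m <- regmatches(url, regexec(pattern, url))[[1]]
  if (length(m) < 2) NA_character_ else m[[2]]
}

# URL copied from the log above
url <- "https://bioconductor.org/packages/3.8/bioc/src/contrib/Archive/AnnotationDbi/AnnotationDbi_1.46.0.tar.gz"
release <- bioc_release_from_url(url)
```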

Cheers.

There may still be improvements to make in renv before it is fully compatible with Bioconductor.

I think your feedback can be useful if you can test it further.

Time allowing, I am happy to help @cderv. Just let me know how.