How to manage R package dependencies for shiny app deployment (docker)

I'm not sure exactly what context you are working in, but this is one of the problems that RStudio Connect was designed to resolve. It solves the package dependency / reproducibility problem by using packrat, with the blue publishing button in the IDE handling all of the "dirty work" for you (much like shinyapps.io, if you are familiar). It does not use docker per se, but it behaves in a very similar fashion by sandboxing each piece of content.

If you do want to attack the problem directly in docker, this article may be useful to your quest as it discusses (the surface of) many of these options. packrat is probably my personal favorite, but it definitely has its drawbacks (one of which is building all of the packages from source). Some other gotchas to beware of:

  • Operating System differences between your docker image and your host, especially if you are "mounting" the packages into the docker image and not installing them into the image directly
  • Local package installation with packrat requires a CRAN-like repo. Alternatively, you can host the packages on git (packrat knows how to handle git installations from remotes::install_github or devtools::install_github). Again, if you are in the enterprise, RStudio Package Manager may be a good fit. Otherwise, the miniCRAN and drat packages can get you going.
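
To make the "CRAN-like repo" point a bit more concrete, here is a rough sketch using drat. The tarball name `mypkg_0.1.0.tar.gz` and the `local-repo` directory are made up for illustration, and I'm assuming a Unix-like path for the `file://` URL, so treat this as a starting point rather than a recipe:

```r
# Sketch only: the tarball name and repo location below are hypothetical.
library(drat)

repo_dir <- file.path(getwd(), "local-repo")   # hypothetical repo location
dir.create(repo_dir, showWarnings = FALSE)

# Copy a built source package into the repo and (re)write its PACKAGES index.
insertPackage("mypkg_0.1.0.tar.gz", repodir = repo_dir)

# Put the local repo ahead of CRAN so install.packages() and packrat can see it.
# (The file:// URL form shown here assumes a Unix-like path.)
options(repos = c(
  local = paste0("file://", repo_dir),
  CRAN  = "https://cloud.r-project.org"
))
```

If you mount or copy that directory into the image, the same `options(repos = ...)` call inside the container should let the local package resolve there as well.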

One of the main benefits packrat provides is the packrat/packrat.lock file, which allows you to "commit" / track your exact package dependencies at a particular moment in time (across local, GitHub, and CRAN packages). Some of the other semantics can get pretty messy to deal with (again, building packages from source can take a long time), but that packrat.lock file is gold :smiley:
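
For reference, the basic packrat round trip looks roughly like the sketch below. I'm assuming you run the first half in the app's project directory on your development machine and the second half wherever the project (including the packrat/ folder) has been copied, e.g. during a docker build:

```r
# On your development machine, from the project directory:
packrat::init()       # sets up a private library and writes packrat/packrat.lock
packrat::snapshot()   # re-run after installing or updating packages to refresh the lockfile

# In the image (or on any other machine), with the project files copied in:
packrat::restore()    # installs the package versions recorded in packrat/packrat.lock
```

Committing packrat/packrat.lock alongside the app code is what lets the restore step rebuild the same library later.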

You're definitely in a good place for this discussion, though! I know there have been several related discussions of reproducibility and docker on Community lately, so it might be worth searching for some of those!
