Are packrat bundles really portable across different platforms?


#1

I have a packrat bundle built on a MacOSX box.
Is this bundle portable across different platforms? Windows, for example.
I am organising a course where each student will use their own machine.
I thought it would be good to use packrat to build and deploy the environment and make sure that it is the same for everyone.

The packrat home page says it is portable, but I'd like to know if anyone has previous experience deploying packrat bundles across different platforms.
I should also mention that I plan to use packages with native library dependencies, such as dplyr.


#2

When you say packrat bundle, I will presume you are talking about the result of packrat::bundle().

The important thing to note about this function is that it has a lot of arguments. As a result, what your bundle will look like will depend on your selection of arguments.

To answer your bigger question, though, packrat is definitely a great solution for portability and reproducibility of projects across different operating systems. I have experience using packrat to reproduce the same project on Windows and on Linux.

One important caveat, though: packrat manages several different things. It helps to have a good understanding of what these are so that you can follow what is happening.

  • the versions of the packages in your project. This is essential and is stored in the packrat/packrat.lock file
  • the options you have selected for packrat's behavior. These are stored in the packrat/packrat.opts file
  • an "auto-executable" in your project, typically stored in .Rprofile. This automatically initializes packrat when the project is opened
  • the package sources, typically stored at packrat/src. This is the operating-system agnostic source code for packages. The sources are available on CRAN as well, so no worries if you lose this part of the repository
  • package binaries. These are compiled for each operating system and are stored in sub-directories like packrat/lib

All you really need to make your project reproducible on another operating system is packrat.lock. If you have this file, a student can run packrat::restore() to download package source files from CRAN and then build the packages on her own computer.
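
As a rough sketch of what a student could run, assuming the project folder already contains packrat/packrat.lock and the machine has internet access and build tools. The helper name restore_course_env, the CRAN mirror URL, and the example path are placeholders I made up for illustration:

```r
# Hypothetical helper: restore the packages recorded in packrat/packrat.lock.
restore_course_env <- function(project_dir) {
  lockfile <- file.path(project_dir, "packrat", "packrat.lock")
  if (!file.exists(lockfile)) {
    stop("No packrat.lock found in: ", project_dir)
  }
  # Install packrat itself if it is missing (the mirror is an assumption).
  if (!requireNamespace("packrat", quietly = TRUE)) {
    install.packages("packrat", repos = "https://cloud.r-project.org")
  }
  # Install the locked package versions into the project's private library.
  packrat::restore(project = project_dir, prompt = FALSE)
}

# Example call (the path is a placeholder):
# restore_course_env("~/course-project")
```

The check for packrat.lock comes first so that a student who opens the wrong folder gets a clear message instead of a confusing packrat error.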

However, packrat.opts can be useful, and packrat::bundle() has options for including the other items. Including the sources in your bundle (the default) will save students the time of downloading source files from CRAN when loading up the project. Up until this point, though, all of the options I have mentioned require the student to be able to build packages from source. This requires build tools such as a C++ compiler (typically Rtools on Windows and the Xcode command line tools on macOS). I think recent versions of RStudio will offer to help install these, but it is worth checking beforehand.

To avoid building packages from source, the only option you really have in the packrat world is packrat::bundle(include.lib = TRUE). The important caveat here is that you and the students must all have the same operating system. You will build binaries before bundling, and then include the binaries in the bundle. If the student has the same type of operating system, they should be able to open the bundle and use the binaries.
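
For illustration, the bundling step might look something like the sketch below. The helper name make_course_bundle, its with_binaries argument, and the file names are all hypothetical; the packrat::bundle() and packrat::unbundle() arguments are the ones I know from the package documentation, but do check them against your installed version:

```r
# Hypothetical helper (instructor side): create a bundle of the project.
make_course_bundle <- function(project_dir, bundle_file, with_binaries = FALSE) {
  packrat::bundle(
    project     = project_dir,
    file        = bundle_file,
    include.src = TRUE,          # ship package sources (the default)
    include.lib = with_binaries, # also ship the compiled binaries
    overwrite   = TRUE           # replace an existing bundle file
  )
}

# Instructor, on a macOS machine (paths are placeholders):
# make_course_bundle("~/course-project", "course-macos.tar.gz", with_binaries = TRUE)

# Student, on a machine with the same operating system:
# packrat::unbundle(bundle = "course-macos.tar.gz", where = "~/projects")
```

By default unbundle() also tries to restore the project after unpacking, so with sources included a student without matching binaries would still need the build tools.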

Hopefully that is a helpful overview :slight_smile: I generally prefer to track packrat.lock and packrat.opts in version control (git) and exclude all of the other folders (lib, src, etc.) from version control. That typically gives me what I need to reproduce the project on another computer (even w/ another OS) using packrat::restore(). However, packrat::bundle() is certainly a valid approach! I would recommend testing with a clean computer analogous to a student's (maybe a new AWS VM... it starts w/ a free tier to save $$) so you can get a sense for which workflow is best.
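
If you go the version control route, a small sketch of how the lib and src folders could be kept out of git. The helper name ignore_packrat_dirs is made up; vcs.ignore.lib and vcs.ignore.src are packrat options, and as far as I understand packrat then updates the project's ignore file for you, but please verify that on your own setup:

```r
# Hypothetical helper: ask packrat to keep lib/ and src/ out of version control.
ignore_packrat_dirs <- function(project_dir) {
  packrat::set_opts(
    vcs.ignore.lib = TRUE,  # do not track compiled package binaries
    vcs.ignore.src = TRUE,  # do not track downloaded package sources
    project = project_dir
  )
}

# Example call (the path is a placeholder):
# ignore_packrat_dirs("~/course-project")
```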

I am curious to hear how it goes, though!


#3

Hi @cole. Thanks a lot.
I settled on packrat::bundle() because I don't know what the Internet connection in the classroom will be like, so I thought it was a good idea to bring the bundle on a pen drive and pass it to the students to install the environment.

But now I believe packrat.lock is exactly what I'm looking for. The suggestion to version control packrat.lock is really awesome.
I will create a repo, share it with the students along with the instructions, and ask them to have the environment built before the class.
I will also create bundles for different platforms just in case someone doesn't have the environment installed.

Thanks a lot.


#4

My pleasure! Glad it was helpful.

That makes a lot of sense - the internet struggle is real. I have heard the idea thrown around of bringing a wifi router into situations like that to create a local network for file transfers / downloads / package installs / etc. That's a lot of architecture with little pay-off, though!


#5

Probably overkill, but another option, which would control more than the package versions, is to use Docker, which now has a wrapper in R.


#6

OR you can use docker and packrat together :wink: Helpful if you ever want to rebuild your image with some semblance of similarity. I have to admit that I'm unfamiliar with the R wrapper around docker, though.