Syncing Packages Across Machines


#1

What's the best practice for keeping R packages synced across multiple machines?


#2

Do you mean you have 2 computers (for example) and you want to have them both have identical versions of R packages? Or is it more local dev station/staging/production type of deal? The answer is probably different for these 2 use-cases.


#3

@mishabalyasin Yes, I have two computers and want to sync an identical library of R packages across them both.


#4

I can think of a couple of different ways you might do this. As for best practice, it depends on your resources and how critical this is to you. I'd say a centralized, load-balanced setup (RStudio Server, SQL Server 2016+, etc.) would be the best-practice scenario.

1. Docker

Here you can create a core environment that is identical on both systems, so nothing drifts between them. Check out the following tutorial.

2. Versioned R Script

You could write a script that checks for the right packages, and downloads them if necessary, so they are set up at least at every startup. The following link will give you an idea of where to go from here.
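
A minimal sketch of such a script, assuming you keep a small manifest of package names and versions under version control. The manifest contents and the function names (`packages_to_fix`, `installed_versions`, `sync_packages`) are made up for illustration, and the install step assumes the remotes package is available:

```r
# Hypothetical manifest: package name -> required version (example values only).
required <- c(jsonlite = "1.8.8", stringr = "1.5.1")

# Names of required packages that are missing from `installed`, or present at
# a different version. Both arguments are named character vectors of versions.
packages_to_fix <- function(required, installed) {
  have <- installed[names(required)]            # NA where a package is missing
  out_of_sync <- is.na(have) | have != required
  names(required)[out_of_sync]
}

# Versions currently installed on this machine, as a named character vector.
installed_versions <- function() {
  ip <- utils::installed.packages()
  stats::setNames(ip[, "Version"], ip[, "Package"])
}

# Report what is out of sync; only install when explicitly asked to.
sync_packages <- function(required, install = FALSE) {
  todo <- packages_to_fix(required, installed_versions())
  if (install && length(todo) > 0) {
    for (pkg in todo) {
      remotes::install_version(pkg, version = required[[pkg]], upgrade = "never")
    }
  }
  invisible(todo)
}
```

Calling `sync_packages(required)` on each machine only reports the packages that differ from the manifest; passing `install = TRUE` attempts to install the listed versions. The calls are namespace-qualified so the script also works if it is sourced from `.Rprofile`, where only the base package is attached.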

3. Controlled Package Environment w/ Versioned R Script

You could use RStudio's new Package Manager: configure the install repository to point to it, and then remove any other repositories. That way, any package a script installs comes from a single regulated source.
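
As a rough sketch, restricting R to a single repository can be done through the `repos` option, for example in each machine's `.Rprofile`. The URL below is a placeholder, not a real server; substitute the address of your own repository:

```r
# Hypothetical URL: replace with your own Package Manager repository address.
controlled_repo <- "https://packagemanager.example.com/cran/latest"

# Replace (rather than append to) the repository list, so that
# install.packages() only ever looks at this one source.
options(repos = c(CRAN = controlled_repo))
```

After this runs, `getOption("repos")` should list only the controlled repository.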

4. RStudio Server

Centralized computation is never a bad thing, especially if you have the budget for it.

Hope that helps!


#5

To add a few more options:

  1. miniCRAN - https://cran.r-project.org/web/packages/miniCRAN/index.html. Here you choose which packages end up in the repo.
  2. checkpoint - https://cran.r-project.org/web/packages/checkpoint/index.html. Here you choose a point in time (for example, the 2nd of December), and all packages are installed as they were at that point in time.

Both let you set things up so that your R installation only "sees" a subset of packages that you control.
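
For the miniCRAN route, something along these lines might work. It is only a sketch: the helper names (`build_local_repo`, `use_local_repo`), the upstream mirror, and the repository path are assumptions for illustration, and the `file://` URL assumes a Unix-style path:

```r
# Build a small local repository containing `pkgs` and their dependencies.
# `upstream` is an assumed CRAN mirror; `path` is wherever you want the repo.
build_local_repo <- function(pkgs, path, upstream = "https://cloud.r-project.org") {
  dir.create(path, recursive = TRUE, showWarnings = FALSE)
  repos <- c(CRAN = upstream)
  # Resolve the chosen packages plus the packages they depend on.
  pkg_list <- miniCRAN::pkgDep(pkgs, repos = repos, type = "source", suggests = FALSE)
  # Download the sources and write the repository index under `path`.
  miniCRAN::makeRepo(pkg_list, path = path, repos = repos, type = "source")
  invisible(pkg_list)
}

# Point the current session at that repository only (Unix-style path assumed).
use_local_repo <- function(path) {
  options(repos = c(CRAN = paste0("file://", normalizePath(path))))
}
```

You would run `build_local_repo()` once, put the resulting folder somewhere both computers can reach, and call `use_local_repo()` on each of them before installing anything.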