"hydrating" a packrat repository on a new computer?

github
reproducible
packrat
git

#1

I wrote a paper for a special reproducible issue of Computational Statistics. All the materials are actually available here, https://github.com/COSTDataExpo2013/AmeliaMN, including a packrat directory with the versions of packages I was using the last time I modified the code (~3 years ago). For example, dplyr version 0.4.1. I'd like to use these old package versions when I run the code, so I don't have to refactor everything. I think this is essentially the intended use-case of packrat!

In the intervening time, I moved institutions and no longer had the code on my computer. But, since everything is on GitHub I thought I'd be fine! I used the RStudio GUI to create a New Project from Version Control, and all the files were successfully populated onto my computer. But, the Version Control project creation walkthrough does not have a "use packrat with this project" checkbox. I would guess that ordinarily, RStudio would be looking for some sort of flag in the Rproj file or similar, to decide whether to use packrat or not. My packrat directory only has a src subdirectory, not a lib subdirectory, so I wonder if packrat has changed in the intervening years and I didn't have the right files there to flag it as a packrat directory? I do see an option to "git ignore packrat library," so maybe I actually have the right files, but not the right options elsewhere. Whatever the case, my project did not start out with "use packrat with this project" checked in the Project Options. I've tried a couple of things, but can't seem to "hydrate" the project on my new computer.

Can someone give me some guidance about how to get all the old versions of packages hooked up correctly on my new computer? I tried adding the packrat directory to the "Local repositories" box in the Project Options, but that doesn't seem to have done it.


#2

If you have packrat/packrat.lock, you should be good to go! packrat::restore() from the project root should get you started! You will need to be able to compile packages from source, but that is a pretty usual requirement in R.
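
For anyone who prefers the terminal to the R console, here is a minimal sketch of the same step. The helper name restore_packrat_project is made up for illustration, and it assumes Rscript is on the PATH and that the packrat package is available to that R installation.

```shell
# Hypothetical helper: restore a packrat project from the terminal.
# Assumes Rscript is on PATH and the packrat package can be loaded by R.
restore_packrat_project() {
  project_dir="${1:-.}"
  # restore() works from the lockfile, so stop early if it is missing.
  if [ ! -f "$project_dir/packrat/packrat.lock" ]; then
    echo "No packrat/packrat.lock found in $project_dir" >&2
    return 1
  fi
  # Run from the project root, in a subshell so the caller's directory is unchanged.
  (cd "$project_dir" && Rscript -e 'packrat::restore()')
}
```

Usage would be something like `restore_packrat_project path/to/project`, where the path is wherever the repository was cloned.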

EDIT: Also, if you want your RStudio to automatically do this type of thing on startup, you just need to add the default packrat .Rprofile to version control in the future. It sources the packrat/init.R script to do this automatically when you start your R session.


#3

Ah, that helps! What is my recourse when one of the packages fails? In this case, it seems like Rcpp isn't installing properly. I don't know that I actually need Rcpp (probably just got wrapped into packrat because I had it on my old computer...) so is there a way to remove that as a package?

EDIT: Seems like Rcpp is a dependency for something I actually do need, and my issue is probably C++ related on the new machine. I'll struggle with this a bit more.


#4

Yeah, this is where packrat can become painful :slight_smile: Rcpp is necessary to build dplyr, for instance, so it got roped in as a dependency. You can remove it, but then your installation of dplyr will likely fail. One of the challenging things about R package management over time is that building packages from source is a firm requirement. Do you have more information you can share about how the package is failing to install? What operating system are you on? There is a fair amount of information out on the web on getting your system building packages from source.
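
As a quick first check before digging into individual failures, a minimal sketch that lists which build tools are missing from the PATH. The helper name missing_tools is made up, and the tool list in the usage line is an assumption rather than an official list of what R needs.

```shell
# Hypothetical helper: print each named tool that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

For example, `missing_tools make cc gfortran awk` prints nothing if all four are available, and otherwise prints the names that still need to be installed.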


#5

One of the problems is that I don't have complete admin access on this computer (macOS 10.13.6), but I've tried again on an account that seems to be mostly an admin. At the moment, I'm stuck at installing leaps; the most relevant-seeming part of the error message is:

ld: warning: directory not found for option '-L/usr/local/gfortran/lib/gcc/x86_64-apple-darwin15/6.1.0'
ld: warning: directory not found for option '-L/usr/local/gfor

Google led me to this highly-upvoted SO answer by @kevinushey, but I don't have enough knowledge about package config files and/or system files on my computer to know what it means. Is FLIBS a file on my computer? Is it within some high-level directory, like usr/local or is it in the package directory?


#6

As an alternative path, I decided to try hydrating the repo on rstudio.cloud. The restore() command gets a little further, but it also gets stuck on installing maps. Again, the most relevant-seeming part of the error message is

/bin/bash: f: command not found

Again, I'm in over my head as to what I need to troubleshoot in order to get further. Any guidance would be much appreciated!

In case you're interested in the repo, here is the project I'm trying to get up and running again.


#7

FLIBS is one of the many variables passed to make by R during compilation of package sources; in particular, it's the set of Fortran libraries needed to link Fortran programs. You can see the full list of such variables here:

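# Path to the Makeconf file that R was configured with
makeconf <- file.path(R.home("etc"), "Makeconf")
# Open it to read the variables (put overrides in ~/.R/Makevars, not here)
file.edit(makeconf)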

The values within are the values R was configured with when R itself was compiled, but these can be overridden in various ways -- for example, by specifying them in your own ~/.R/Makevars file.
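
The same lookup can be done from a terminal. This is a minimal sketch: the helper name show_make_var is made up, and it simply greps a Makeconf-style file for one variable's definition.

```shell
# Hypothetical helper: print the line that defines a make variable in a Makeconf/Makevars file.
show_make_var() {
  makeconf="$1"
  var="$2"
  # Match definitions written as "FLIBS = ..." or "FLIBS=..."
  grep -E "^${var}[[:space:]]*=" "$makeconf"
}
```

A typical call would be `show_make_var "$(R RHOME)/etc/Makeconf" FLIBS`, assuming the etc directory sits under the R home directory on your system. `R CMD config FLIBS` should report the configured value as well.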

Most of these variables are documented in R-exts, although it's tough to hunt down what you need to know, if you don't know what you need to know :slight_smile: In this particular case, FLIBS is documented at:

https://cran.r-project.org/doc/manuals/r-release/R-exts.html#index-FLIBS


#8

It'd be helpful to see the entire output (assuming there are some preceding messages here) so we have a better idea of where this f is coming from.


#9

Okay, just to document for posterity (and for people at a similar computer literacy level to me): in my Terminal app, I ran

gfortran -print-file-name=libgfortran.dylib

and copied the result (a long path, /usr/local/Cellar/gcc/8.2.0/lib/gcc/8/gcc/x86_64-apple-darwin17.7.0/8.2.0/../../../libgfortran.dylib). Then, again in the Terminal, I ran

vi ~/.R/Makevars

And edited the file so it had a line with

FLIBS=-L/usr/local/Cellar/gcc/8.2.0/lib/gcc/8/gcc/x86_64-apple-darwin17.7.0/8.2.0/../../../libgfortran.dylib

Where all that nonsense after the -L was the result of running the first command. After all that, the packrat::restore() got further, and I'm now in the same spot on my local machine as I was on cloud! Here is the full error message:

The command failed with output:
* installing *source* package 'maps' ...
** package 'maps' successfully unpacked and MD5 sums checked
** libs
** arch - 
clang -Wall -g -O2  -I/usr/local/include -L/usr/local/lib  Gmake.c   -o Gmake
clang -Wall -g -O2  -I/usr/local/include -L/usr/local/lib  Lmake.c   -o Lmake
Converting world to world2
f convert.awk < world.line > world2.line
/bin/sh: f: command not found
make: [world2.line] Error 127 (ignored)
make county.L state.L usa.L nz.L world.L world2.L italy.L france.L state.vbm.L state.carto.L
./Lmake 0 s b county.line county.linestats ../inst/mapdata/county.L
./Lmake 0 s b state.line state.linestats ../inst/m
In addition: Warning message:
In packrat::restore() :
  The most recent snapshot was generated using R version 3.1.2
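
One caveat on the FLIBS line earlier in this post: -L normally takes a directory to search, not the path to a library file. A minimal sketch, assuming the directory containing libgfortran.dylib is what is wanted, derives it with dirname. The helper name flibs_line_for is made up for illustration.

```shell
# Hypothetical helper: print a FLIBS line pointing at the directory that holds a library file.
flibs_line_for() {
  lib_path="$1"
  # dirname drops the file name, leaving the directory for -L
  printf 'FLIBS=-L%s\n' "$(dirname "$lib_path")"
}
```

Something like `flibs_line_for "$(gfortran -print-file-name=libgfortran.dylib)"` prints the line, which can then be pasted into ~/.R/Makevars.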

#10

I see the same thing if I attempt to install that version of maps from source:

$ R CMD INSTALL --preclean maps_2.3-7.tar.gz
* installing to library ‘/Users/kevin/Library/R/3.5/library’
* installing *source* package ‘maps’ ...
** package ‘maps’ successfully unpacked and MD5 sums checked
rm -f mapget.o smooth.o thin.o Gmake Lmake world2.* maps.s[lo] maps.dylib *.exe maps.dll symbols.rds
rm -f -r ../inst ../libs
** libs
** arch -
clang -g -O3 -Wall -pedantic -I/usr/local/include -L/usr/local/lib  Gmake.c   -o Gmake
clang -g -O3 -Wall -pedantic -I/usr/local/include -L/usr/local/lib  Lmake.c   -o Lmake
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -g -O3 -Wall -pedantic -c mapget.c -o mapget.o
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -g -O3 -Wall -pedantic -c smooth.c -o smooth.o
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -g -O3 -Wall -pedantic -c thin.c -o thin.o
Converting world to world2
f convert.awk < world.line > world2.line
/bin/sh: f: command not found
make: [world2.line] Error 127 (ignored)
make county.N state.N usa.N nz.N world.N world2.N italy.N france.N state.vbm.N state.carto.N
make county.L state.L usa.L nz.L world.L world2.L italy.L france.L state.vbm.L state.carto.L
./Lmake 0 s b county.line county.linestats ../inst/mapdata/county.L
./Lmake 0 s b nz.line nz.linestats ../inst/mapdata/nz.L
./Lmake 0 s b usa.line usa.linestats ../inst/mapdata/usa.L
./Lmake 0 s b state.line state.linestats ../inst/mapdata/state.L
./Lmake 0 s b world.line world.linestats ../inst/mapdata/world.L
./Lmake 0 s b world2.line world2.linestats ../inst/mapdata/world2.L
./Lmake 0 s b italy.line italy.linestats ../inst/mapdata/italy.L
Cannot read left and right at line 1
make[1]: *** [world2.L] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [ldata] Error 2
make: *** Waiting for unfinished jobs....
ERROR: compilation failed for package ‘maps’
* removing ‘/Users/kevin/Library/R/3.5/library/maps’
* restoring previous ‘/Users/kevin/Library/R/3.5/library/maps’

However, the most recent version compiles without issue:

$ R CMD INSTALL --preclean maps_3.3.0.tar.gz
* installing to library ‘/Users/kevin/Library/R/3.5/library’
* installing *source* package ‘maps’ ...
** package ‘maps’ successfully unpacked and MD5 sums checked
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
configure: creating ./config.status
config.status: creating src/Makefile
** libs
** arch -
make -f "/Library/Frameworks/R.framework/Resources/etc/Makeconf" -f Makefile init.o mapclip.o mapget.o smooth.o thin.o
make -f "/Library/Frameworks/R.framework/Resources/etc/Makeconf" -f Makefile Gmake
make -f "/Library/Frameworks/R.framework/Resources/etc/Makeconf" -f Makefile Lmake
make -f "/Library/Frameworks/R.framework/Resources/etc/Makeconf" -f Makefile world2.line
clang -Wall -g -O2  -I/usr/local/include -L/usr/local/lib  Gmake.c   -o Gmake
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -c init.c -o init.o
clang -Wall -g -O2  -I/usr/local/include -L/usr/local/lib  Lmake.c   -o Lmake
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -c mapclip.c -o mapclip.o
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -c mapget.c -o mapget.o
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -c smooth.c -o smooth.o
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -c thin.c -o thin.o
Converting world to world2
awk -f ./convert.awk < world.line > world2.line
Creating legacy world2 database
awk -f ./legacy_convert.awk < legacy_world.line > legacy_world2.line
make -f "/Library/Frameworks/R.framework/Resources/etc/Makeconf" -f Makefile county.L state.L usa.L nz.L world.L world2.L italy.L france.L state.vbm.L state.carto.L legacy_world.L legacy_world2.L lakes.L
make -f "/Library/Frameworks/R.framework/Resources/etc/Makeconf" -f Makefile county.N state.N usa.N nz.N world.N world2.N italy.N france.N state.vbm.N state.carto.N legacy_world.N legacy_world2.N lakes.N
"/Library/Frameworks/R.framework/Resources/bin/R" CMD SHLIB -o maps.so init.o mapclip.o mapget.o smooth.o thin.o
./Lmake 0 s b county.line county.linestats ../inst/mapdata/county.L
./Lmake 0 s b usa.line usa.linestats ../inst/mapdata/usa.L
./Lmake 0 s b state.line state.linestats ../inst/mapdata/state.L
./Lmake 0 s b nz.line nz.linestats ../inst/mapdata/nz.L
./Lmake 0 s b world.line world.linestats ../inst/mapdata/world.L
./Lmake 0 s b world2.line world2.linestats ../inst/mapdata/world2.L
./Lmake 0 s b italy.line italy.linestats ../inst/mapdata/italy.L
./Lmake 0 s b france.line france.linestats ../inst/mapdata/france.L
./Lmake 0 p b state.vbm.line state.vbm.linestats ../inst/mapdata/state.vbm.L
./Lmake 0 p b state.carto.line state.carto.linestats ../inst/mapdata/state.carto.L
./Lmake 0 s b legacy_world.line legacy_world.linestats ../inst/mapdata/legacy_world.L
./Lmake 0 s b legacy_world2.line legacy_world2.linestats ../inst/mapdata/legacy_world2.L
./Lmake 0 s b lakes.line lakes.linestats ../inst/mapdata/lakes.L
make[1]: warning: -jN forced in submake: disabling jobserver mode.
clang -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/usr/local/lib -o maps.so init.o mapclip.o mapget.o smooth.o thin.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
ld: warning: text-based stub file /System/Library/Frameworks//CoreFoundation.framework/CoreFoundation.tbd and library file /System/Library/Frameworks//CoreFoundation.framework/CoreFoundation are out of sync. Falling back to library file for linking.
make -f "/Library/Frameworks/R.framework/Resources/etc/Makeconf" -f Makefile county.G state.G usa.G nz.G world.G world2.G italy.G france.G state.vbm.G state.carto.G legacy_world.G legacy_world2.G lakes.G
./Gmake b county.gon county.gonstats ../inst/mapdata/county.G ../inst/mapdata/county.L
./Gmake b state.gon state.gonstats ../inst/mapdata/state.G ../inst/mapdata/state.L
./Gmake b usa.gon usa.gonstats ../inst/mapdata/usa.G ../inst/mapdata/usa.L
./Gmake b nz.gon nz.gonstats ../inst/mapdata/nz.G ../inst/mapdata/nz.L
./Gmake b world.gon world.gonstats ../inst/mapdata/world.G ../inst/mapdata/world.L
./Gmake b world2.gon world2.gonstats ../inst/mapdata/world2.G ../inst/mapdata/world2.L
./Gmake b italy.gon italy.gonstats ../inst/mapdata/italy.G ../inst/mapdata/italy.L
./Gmake b france.gon france.gonstats ../inst/mapdata/france.G ../inst/mapdata/france.L
./Gmake b state.vbm.gon state.vbm.gonstats ../inst/mapdata/state.vbm.G ../inst/mapdata/state.vbm.L
./Gmake b state.carto.gon state.carto.gonstats ../inst/mapdata/state.carto.G ../inst/mapdata/state.carto.L
./Gmake b legacy_world.gon legacy_world.gonstats ../inst/mapdata/legacy_world.G ../inst/mapdata/legacy_world.L
./Gmake b legacy_world2.gon legacy_world2.gonstats ../inst/mapdata/legacy_world2.G ../inst/mapdata/legacy_world2.L
./Gmake b lakes.gon lakes.gonstats ../inst/mapdata/lakes.G ../inst/mapdata/lakes.L
installing to /Users/kevin/Library/R/3.5/library/maps/libs
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (maps)

I'm guessing the old version of the maps package did not attempt to discover an awk installation, and so the onus is on the user to set e.g.

AWK = /usr/bin/awk

in their own ~/.R/Makevars file.
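
A minimal sketch of adding that line from the terminal without clobbering an existing Makevars. The helper name add_awk_to_makevars is made up, and it uses whatever awk is first on the PATH rather than hard-coding /usr/bin/awk.

```shell
# Hypothetical helper: append "AWK = <path>" to a Makevars file unless AWK is already set there.
add_awk_to_makevars() {
  makevars="$1"
  awk_path="$(command -v awk)" || { echo "awk not found on PATH" >&2; return 1; }
  # Create the directory and file if they do not exist yet.
  mkdir -p "$(dirname "$makevars")"
  touch "$makevars"
  if grep -Eq '^AWK[[:space:]]*=' "$makevars"; then
    echo "AWK is already set in $makevars" >&2
  else
    printf 'AWK = %s\n' "$awk_path" >> "$makevars"
  fi
}
```

Usage: `add_awk_to_makevars ~/.R/Makevars`. Running it twice leaves a single AWK line in the file.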


#11

And just to confirm, inspecting the src/Makefile of the source package for maps, I see:

world2.line: world.line
	@$(ECHO) "Converting world to world2"
	$(AWK) -f convert.awk < world.line > world2.line
	@$(CP) world.linestats world2.linestats
	@$(CP) world.gon world2.gon
	@$(CP) world.gonstats world2.gonstats
	@$(CP) world.name world2.name

which means the package is expecting the AWK make variable to be defined. (I'm not sure if older versions of R happened to define it, but it appears to no longer be defined?)

This all comes down to one of the core issues with Packrat: it attempts to compile packages from sources, but compiling from source is not always trivial. In many cases, compiling packages from sources successfully (especially with older versions of R) will require some monkeying around to get things right.


#12

Monkeying around is right! I'm kind of enjoying trying to troubleshoot this on my computer and RStudio Cloud simultaneously... Gives additional information!

To simplify the list of things to troubleshoot, I manually edited my packrat.lock file to remove some packages I know I don't use. It also seems like it helps to use an older version of R, so I've switched to 3.1.3 on the cloud, and 3.2.1 on my computer. Both are now so close to working, but are stuck on a mismatch between Rcpp and dplyr. From Cloud:

Installing dplyr (0.4.1) ... 
Error: Command failed (1)

Failed to run system command:

	'/opt/R/3.1.3/lib/R/bin/R' --vanilla CMD INSTALL '/tmp/Rtmpx1UZdW/dplyr' --library='/cloud/project/packrat/lib/x86_64-pc-linux-gnu/3.1.3' --install-tests --no-docs --no-multiarch --no-demo 

The command failed with output:
* installing *source* package 'dplyr' ...
** package 'dplyr' successfully unpacked and MD5 sums checked
** libs
Error: package 'Rcpp' 0.11.2 was found, but >= 0.11.3 is required by 'dplyr'
* removing '/cloud/project/packrat/lib/x86_64-pc-linux-gnu/3.1.3/dplyr'
In addition: Warning message:
In packrat::restore() :
  The most recent snapshot was generated using R version 3.1.2

If it's true that dplyr needs Rcpp >= 0.11.3, how did I get into this mismatched state for packrat to snapshot back in the day? Maybe I installed something from GitHub? In any case, what's the best strategy for fixing it?


#13

Oof. I don't have a good idea how the lockfile might've gotten into this state in the first place. Maybe Rcpp 0.11.3 was installed, then dplyr 0.4.1 was installed, then Rcpp was rolled back to 0.11.2? But that would imply the project was, at snapshot time, in a broken state where dplyr wouldn't be able to load, so that's fairly surprising.

My only recommendation would be to try just changing the Rcpp version in the lockfile to 0.11.3 and crossing your fingers, but that's obviously not ideal...
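
If you'd rather not edit by hand, here is a minimal sketch of that change. It assumes the lockfile stores each package as a block of "Field: value" lines beginning with a Package: line and containing a Version: line; that layout is an assumption about the lockfile format, and the helper name set_lock_version is made up.

```shell
# Hypothetical helper: change the Version field of one package in a packrat.lock-style file.
# Assumes each package record starts with "Package: <name>" and has a "Version: <x>" line.
set_lock_version() {
  lockfile="$1"
  pkg="$2"
  version="$3"
  # Keep a backup so the edit is easy to undo.
  cp "$lockfile" "$lockfile.bak"
  awk -v pkg="$pkg" -v ver="$version" '
    /^Package:/ { in_pkg = ($2 == pkg) }                  # track which record we are in
    in_pkg && /^Version:/ { print "Version: " ver; next } # rewrite only that record
    { print }                                             # everything else passes through
  ' "$lockfile.bak" > "$lockfile"
}
```

Usage: `set_lock_version packrat/packrat.lock Rcpp 0.11.3`, followed by another packrat::restore(). All other fields in the record are left untouched.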


#14

Ah, that seems to have solved the problem! I didn't realize I could manually change the version in the lockfile; I assumed the hash had some correspondence to the version, but it seems not to.


#15

The hash is only used during package caching; this is an optional feature and not enabled by default for Packrat projects. (It's a mechanism whereby multiple projects can share installed packages, so that future calls to packrat::restore() are less time-consuming.)

This is something we should communicate more clearly; I don't think we explicitly document any of the fields used in the lockfile anywhere.