Installing a package that uses git-lfs to store large files

We have some R packages containing data that we would like to distribute to support reproducible research. The packages are hosted on GitHub (e.g. https://github.com/jefferislab/lhns) and some are large enough (700 MB of data) that we are experimenting with git-lfs to store the data objects. Some future packages might be several GB. Right now the only way that we have found for an end user to install these packages is

  1. install git-lfs client
  2. clone repository
  3. devtools::install the local checkout of the repository
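
For anyone who wants to script those steps, here is a rough sketch in R. It assumes the git and git-lfs command-line tools are already installed through the system package manager (step 1 still has to happen outside R) and that devtools is available. The helper names lfs_git_args and install_lfs_package are made up for illustration and are not part of any package.

```r
# Build the git argument vectors for the workflow, kept separate so they can
# be inspected without running anything. (Hypothetical helper.)
lfs_git_args <- function(repo_url, dest) {
  list(
    check = c("lfs", "version"),        # step 1: confirm the git-lfs client is present
    setup = c("lfs", "install"),        # register the lfs filters in the user's git config
    clone = c("clone", repo_url, dest)  # step 2: clone; lfs objects are fetched on checkout
  )
}

# Run the git steps, then install from the local checkout. (Hypothetical helper.)
install_lfs_package <- function(repo_url,
                                dest = file.path(tempdir(), basename(repo_url))) {
  for (args in lfs_git_args(repo_url, dest)) {
    status <- system2("git", args)
    if (status != 0) {
      stop("`git ", paste(args, collapse = " "), "` failed with status ", status)
    }
  }
  devtools::install(dest)               # step 3: install the local checkout
}

# Example call (clones several hundred MB, so it is left commented out):
# install_lfs_package("https://github.com/jefferislab/lhns")
```

This only wraps the manual workflow. It does not solve the dependency-management problem, because the package still cannot be installed with the usual remote installers.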

This is not a very satisfactory workflow for less experienced users and breaks automated dependency management. Has anyone found a better way of dealing with git-lfs? Pointers to other approaches to distributing large versioned data for R in a painless way for end users are also appreciated.

With many thanks,

Greg Jefferis,
MRC LMB and University of Cambridge

Another option might be to put the data in a data repository that has an R package to access its API. There are several options (Zenodo, figshare, Dataverse, etc.), but the one I often use for this kind of thing is https://github.com/CenterForOpenScience/osfr. Then the end user can do something like osfr::download_file('zevw2', 'test123.md')
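
As a slightly fuller sketch of that idea, a data package could ship a small helper that fetches the file on demand. This assumes the later rOpenSci release of osfr, where, as far as I know, the calls are osf_retrieve_file() and osf_download() rather than the download_file() call above; check the documentation for the version you have installed. The function name fetch_osf_file and the destination directory are hypothetical.

```r
# Download one OSF file by its id and return the local path.
# Assumes the rOpenSci osfr API; fetch_osf_file is a made-up name.
fetch_osf_file <- function(id, dest_dir = file.path(tempdir(), "osf-data")) {
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  remote <- osfr::osf_retrieve_file(id)   # look up the file by its OSF id
  local <- osfr::osf_download(remote, path = dest_dir, conflicts = "overwrite")
  local$local_path                        # path of the downloaded copy
}

# Example call (needs network access and the osfr package):
# path <- fetch_osf_file("zevw2")
```

The package itself then stays small enough to install straight from GitHub, and the large data is only pulled the first time a user asks for it.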
