Packaging of native extensions - vendoring of dependencies

Hello, world!

This is my first post here. Apologies if I am breaking any rule.

I am the maintainer of the xtensor package on CRAN. xtensor is a tool for manipulating N-D arrays with a high-level API in C++. It also has bindings to Python and Julia.

  • The CRAN package for xtensor vendors the headers of xtensor-r, and its other dependencies xtensor and xtl.
  • The tarball uploaded to CRAN is built from the source repository with a special CMake target: make cran produces the tarball that we submit to CRAN. This build step lets us keep both the dependencies and the files generated by Rcpp out of the repository.

Now we would like to improve this in several ways:

  • separate the R package repository from the xtensor-r pure C++ repository (started here https://github.com/QuantStack/Xtensor.R), and probably drop cmake and use something else to generate the tarball with all the vendoring.
  • this new R package should be installable from GitHub with devtools, but ideally not vendor xtensor-r, xtensor, xtl in the repository.
  • package it for conda, in a way that does not vendor xtensor-r, xtensor, xtl. The conda package will simply depend on the conda packages for xtensor-r, xtensor and xtl which already exist.

Hence the questions: is there a way to perform custom actions during R CMD build (such as fetching resources to be vendored with the package), and to add custom options to the command (such as --no-vendoring)?

I don't think there is a perfect solution to doing this in R CMD build, but there are some other options.

Under a Unix-alike only, an executable (Bourne shell) script cleanup is executed as the last thing by R CMD INSTALL if option --clean was given, and by R CMD build when preparing the package for building from its source. [1]

So you could potentially put these steps in a cleanup script, but you would have to somehow condition the vendoring so that it only occurs when the script is run during R CMD build.

Alternatively, you could download the dependencies during R CMD INSTALL by putting the download steps in the configure script. This is probably the most straightforward solution, but of course it assumes internet access during installation. Note that if you go this route, configure is only run for source installations, so users installing the binaries would not need to download the vendored dependencies. This is probably the route I would recommend.

See the configure example in Writing R Extensions for more details on using and customizing configuration.

Thank you for your reply! This is very helpful.

The configure route looks like a good one. The tarball for CRAN could still include the vendored dependencies if the configure script checks for their presence before downloading them, which would solve the issue of installing the package without access to the internet.
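
A rough sketch of that check, as it might appear in a configure script. The directory inst/include, the helper names and the download URL are all assumptions made for illustration; the real location of the xtensor-r, xtensor and xtl sources would have to be substituted, and the sketch assumes each archive unpacks into a directory named after the dependency.

```shell
#!/bin/sh
# Sketch only: fetch a header-only dependency unless the tarball already ships it.
# VENDOR_DIR and FETCH_URL_BASE are hypothetical names, not R conventions.

VENDOR_DIR="${VENDOR_DIR:-inst/include}"
FETCH_URL_BASE="${FETCH_URL_BASE:-https://example.invalid/headers}"

# Succeeds when the headers for dependency $1 are already present.
is_vendored() {
    [ -d "$VENDOR_DIR/$1" ]
}

# Download and unpack dependency $1 (placeholder URL layout).
fetch_dep() {
    url="$FETCH_URL_BASE/$1.tar.gz"
    tmp=$(mktemp) || return 1
    mkdir -p "$VENDOR_DIR" &&
        curl -fsSL -o "$tmp" "$url" &&
        tar -xzf "$tmp" -C "$VENDOR_DIR"
    status=$?
    rm -f "$tmp"
    return $status
}

# Use the vendored copy if it exists, otherwise go online.
ensure_dep() {
    if is_vendored "$1"; then
        echo "using vendored $1"
    else
        echo "fetching $1"
        fetch_dep "$1"
    fi
}
```

The configure script would then call ensure_dep once per dependency (xtensor-r, xtensor, xtl). With a CRAN tarball that already contains the headers, no network access would be attempted.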

Do you know if there is a means to pass options to the configure script? I would like to be able to completely prevent R CMD INSTALL from fetching resources online with an option. I would use this option when building the conda package (or the apt-get or yum package), which can depend on the existing packages for xtensor, xtl, etc.

Do you know if there is a means to pass options to the configure script?

Yes, the configure example I linked to above shows how you can do this, so the user can do something like

R CMD INSTALL \
  --configure-args='--with-odbc-include=/opt/local/include --with-odbc-lib=/opt/local/lib --with-odbc-manager=iodbc' \
  RODBC
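
For the vendoring case, the configure script could look at the arguments it receives in the same way. The following is only a sketch: --no-vendoring and --vendoring are hypothetical flags that the package would define itself, not options known to R, and the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: read a package-defined flag from the arguments that
# R CMD INSTALL forwards via --configure-args.

# Prints "yes" or "no" depending on the flags seen; vendoring is on by default.
parse_vendoring() {
    vendoring=yes
    for arg in "$@"; do
        case "$arg" in
            --no-vendoring) vendoring=no ;;
            --vendoring)    vendoring=yes ;;
            *) echo "configure: ignoring unknown option $arg" >&2 ;;
        esac
    done
    echo "$vendoring"
}

VENDORING=$(parse_vendoring "$@")
echo "vendoring: $VENDORING"
```

A conda (or apt-get, or yum) recipe could then install with --configure-args='--no-vendoring', and the configure script would skip any download step when VENDORING is "no".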

Thanks, this is exactly what I needed. Thanks for taking the time to answer.

If your question's been answered, can you please mark the solution? It's helpful for others who may have the same question in the future.

Here’s how to mark a solution: FAQ: How do I mark a solution?

Thanks!

Done! Thanks again, I did not know about this feature.
