Using R and conda

I am using R on a shared server. Not having root permissions makes some things difficult. I recently started using conda, which has been working well for specific tools. Everything Python-based worked for me, which makes sense considering conda was developed by Python users. I recently needed to use an R package that had Python-based dependencies. It seemed like a great use case for conda. Unfortunately, I keep running into problems with R.

  • The most obvious is that R 3.5 is still not available (after 3 months) on either conda-forge or r channels. This is not a problem for most packages, but the latest version of Bioconductor needs the latest version of R.
  • When R is available, it can be either the standard version or the Microsoft R. MRO is fine, but it has its own caveats. It also splits the entire community, since different conda packages depend on different R flavors.
  • My original plan was to use r-base and just install all the packages normally (via install.packages() or biocLite()). Unfortunately, I quickly ran into problems with packages like rJava and rlang, which have external dependencies. I couldn't even install some mainstream packages like tidyverse. You can install individual R packages as conda packages, but if you have several, you quickly run into dependency problems. After a few, conda wanted me to downgrade to R 3.3. I should admit that the biggest problem for me was probably hdf5r, which is more obscure, so fewer combinations are possible. What's frustrating is that depending on the combinations, the results can be very different.

I looked around to see if there are any tutorials or suggestions. All the guides show how to install one package, which not surprisingly works fine. As soon as you try to make any customizations, errors start appearing. Am I doing it wrong or is it really this hard? Is there some community effort to improve the situation? For example, it would not be completely unreasonable if RStudio had created a "base R" package that included a clean copy of R with all the common dependencies.

From what I hear, this is a pretty common story whenever someone tries to use conda R. Theoretically, I'm sure it's possible to get it to work right, but I suspect it'd take a non-negligible amount of work that nobody who knows the requisite turf appears inclined to do.

If you can get it to do what you want without going crazy, great. Otherwise, it's worth asking why you want to use conda. If you want a programmatic way to install and update, there are more general package managers that can install CRAN R (the built-in one on Linux, Homebrew Cask on Mac, maybe choco on Windows). If you want package checkpointing to avoid conflicts (not that I've seen...any, really), using MRAN or pacman or similar may help. For self-contained environments, the rocker Docker images are handy. If you want to enable R kernels in Jupyter, you don't actually need conda R for that.

Regardless of your installation, installing some things (e.g. rJava) will likely still be a pain. (It does enable some cool packages, though.)

To clarify, I would like to have a custom python and R environment on a server without root access. The ideal answer to the first part is probably Docker, but the second part makes that impossible. Conda seems to be a perfect alternative in theory, but as I highlighted above, it's easy to run into dead-ends.

Since I initially posted this, I started experimenting with more complete conda packages, for example those built for packages with extensive dependencies. Ironically, some of them install, but then the one package they are designed for does not actually load.

As I mentioned elsewhere, Nix is a great way to install and maintain software with dependencies. The drawback is that you (briefly) need root access during the initial install to create the /nix directory. After that you never need root access again. The upside is that everything you install with Nix is compatible, so, for instance, if you run

nix-shell -p R rPackages.dplyr rPackages.rJava python pythonPackages.numpy --run R

you will get a sandbox environment with R, dplyr, rJava (with the correct version of the JRE), Python, and numpy, where everything works regardless of whether your host OS has the correct Java, shared libraries, etc.
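If the list of R packages changes often, the command above could be wrapped in a small shell function. This is only a sketch: the function name nix_r_shell and the DRY_RUN variable are hypothetical, introduced here for illustration. It passes names through unchanged, so it assumes the Nix attribute name under rPackages matches the CRAN name.

```shell
#!/bin/sh
# Hypothetical helper: start R under nix-shell with the given R packages.
# Each name gets the "rPackages." prefix used in the command above.
# DRY_RUN=1 (an assumed variable) prints the command instead of running it.
nix_r_shell() {
    pkgs="R"
    for name in "$@"; do
        pkgs="$pkgs rPackages.$name"
    done
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "nix-shell -p $pkgs --run R"
    else
        # $pkgs is left unquoted on purpose so each package is its own argument
        nix-shell -p $pkgs --run R
    fi
}

# Only run when package names are given on the command line.
if [ "$#" -gt 0 ]; then
    nix_r_shell "$@"
fi
```

For example, with DRY_RUN=1 set, calling nix_r_shell dplyr rJava prints the nix-shell command it would run, which makes it easy to check the package list before starting the sandbox.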

We do have Anaconda for purely Python projects because the learning curve is lower, but it is often painful to use for the reasons you mentioned, so we are gradually moving towards using Nix for everything.

I looked at using Nix for this as @alexv suggested, but I'm trying to set this up in a Docker environment and Nix doesn't work well there (see https://github.com/NixOS/nix/issues/971).

I don't use Docker, but as far as I understand, the issue affects a very specific and niche use case, namely installing Nix in a container from scratch using the official installer. Is there a reason you need that instead of just using the standard Nix Docker image (https://hub.docker.com/r/nixos/nix/) and adding your stuff on top?

@alexv - yes, I was hoping to use Nix as a tool for existing Docker environments, which are variously built on CentOS, or on Debian, or Amazon Linux, or Alpine, or whatever. A pre-made Nix image just doesn't let me start with the appropriate platform.

To me, using a pre-made image is much more of a "niche" use case - a much more standard way of using a piece of software is just to install that software, not download a whole OS image with it pre-installed, right?

@KenWilliams Can you now use the Nix package manager in a non-Nix-based Docker container? It seems like very recent commits have resolved the issue, but I haven't tried it yet. I might try to use Nix with rocker images sometime in the future.

If anyone has a specific R package that they are having trouble installing with conda, I am happy to help troubleshoot. The support for R from conda, and especially the community channel conda-forge, is continuously improving (note that I help maintain the R packages on conda-forge).

For example, the code below creates a conda environment with R 3.6.1, the tidyverse, and difficult-to-install packages like rlang and rJava (including their system dependencies, which is one of the most useful benefits of using a package manager like conda):

conda create -n r-test r-base r-tidyverse r-rlang r-rjava r-hdf5r
conda activate r-test
Rscript -e 'library(tidyverse)'
conda deactivate

I ran the above using conda version 4.7.10 on Ubuntu 18.04. Almost exactly the same code can also be run on Windows to install the same environment (the exception is hdf5r, which we haven't been able to build successfully on Windows yet).
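Since an environment can install cleanly while a package still fails to load (as reported earlier in this thread), it may be worth checking every package rather than only tidyverse. Here is a minimal sketch, assuming a POSIX shell with the environment already activated. The function name check_r_packages and the RSCRIPT override variable are hypothetical, introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: try to load each named R package and report the result.
# RSCRIPT is an assumed override for the Rscript binary (defaults to "Rscript").
check_r_packages() {
    failed=0
    for pkg in "$@"; do
        # If library() errors, Rscript exits with a non-zero status.
        if "${RSCRIPT:-Rscript}" -e "suppressPackageStartupMessages(library($pkg))" >/dev/null 2>&1; then
            echo "ok: $pkg"
        else
            echo "FAILED: $pkg"
            failed=1
        fi
    done
    # Non-zero return value if at least one package did not load.
    return "$failed"
}

# Only run when package names are given on the command line.
if [ "$#" -gt 0 ]; then
    check_r_packages "$@"
fi
```

Saved as check_r_packages.sh (a hypothetical file name), it could be run after conda activate r-test as: sh check_r_packages.sh tidyverse rlang rJava hdf5r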
