Is it normal to have to install most packages from source in Ubuntu R?

I got to like R under Windows, and I got to love it under OSX. Installation is very simple, especially now that I'm able to link high performance BLAS also in OSX (previously I used MRAN in Windows to achieve the same results).

It turns out I was spoiled: R in Linux (in Docker, actually, but let's consider just Ubuntu for now) is much more annoying, which surprised me (I always thought Linux to be the Emperor of Scientific Computing Environments). I can't even get the latest R binaries. I had a look at

https://cran.r-project.org/bin/linux/ubuntu/README.html

specifically,

R 3.5 packages for Ubuntu on i386 and amd64 are available for [...] Bionic Beaver (18.04;LTS). Note, to install R 3.5 packages, a different sources.list entry is needed. See below for details.

To obtain the latest R 3.5 packages, add an entry like

deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/

in your /etc/apt/sources.list file.

My understanding is that I can use 3.5 packages with Ubuntu 18.04, and I have to modify /etc/apt/sources.list in order to do that. However, if I go to

https://packages.ubuntu.com/search?suite=bionic&searchon=names&keywords=r-base

it's evident that the r-base apt package for Ubuntu is 3.4.4. And as a matter of fact, that's what

apt-get install -y r-base

gets me. What am I doing wrong?

Also, it looks like packages are always installed from source. When installing zoo, that's not a big deal, but installing janitor felt like installing Ubuntu itself :scream: I can only shudder at the thought of installing the tidyverse from source! Is there any obvious shortcut I'm missing?
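For reference, the README steps quoted above can be sketched as a dry run that only prints the commands instead of executing them. The function names `cran35_entry` and `cran35_plan` are made up for illustration, and the README additionally describes importing the archive's signing key, which is left out here. The point of the sketch is that `apt-get install r-base` keeps resolving to Ubuntu's own 3.4.4 package until the CRAN entry has been added and `apt-get update` has been run.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the steps implied by the README excerpt.
# cran35_entry and cran35_plan are hypothetical helper names.

# Build the sources.list entry for an Ubuntu codename (e.g. bionic).
cran35_entry() {
    printf 'deb https://cloud.r-project.org/bin/linux/ubuntu %s-cran35/\n' "$1"
}

# Print the commands that would add the entry and install R from it.
cran35_plan() {
    echo "echo '$(cran35_entry "$1")' | sudo tee -a /etc/apt/sources.list"
    echo "sudo apt-get update"
    echo "sudo apt-get install -y r-base"
    # apt-cache policy shows which repository the candidate version comes from.
    echo "apt-cache policy r-base"
}

cran35_plan bionic
```

Reviewing the printed commands before running them by hand is deliberate: editing `/etc/apt/sources.list` affects the whole system (or the whole Docker image).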

Speaking only to the add-on package installation aspect, see the R Installation and Administration Guide:

What install.packages does by default is different on Unix-alikes (except macOS) and Windows. On Unix-alikes it consults the list of available source packages on CRAN (or other repository/ies), downloads the latest version of the package sources, and installs them (via R CMD INSTALL).

So yes, I guess I'd say it's normal in the sense of "as expected". CRAN's archive of binary packages is made available on an unsupported basis and maintained by one person for each platform (macOS and Windows; see: https://cran.r-project.org/web/packages/policies.html#Binary-packages and, e.g., https://cran.r-project.org/bin/windows/contrib/r-release/@ReadMe). Basically, it started as a generous convenience for users on systems where it wasn't common or easy to have all the necessary tools to build packages that need compilation from source (as always, installing packages written in pure R from source is trivial).

A local repository can offer pre-compiled binaries built against whatever systems are relevant to the repository's users (something like this is in action on RStudio Cloud —except they're using RStudio Package Manager). I don't run Linux day-to-day myself, so I don't know if there are other non-CRAN repos hosting R add-on packages that are popular with Linux users (and that stay up to date enough to be useful — a major consideration!). Maybe others will chime in? The Linux users I know install from source and go get :coffee: :grin:.

I'm a long-time OS X user, but I'm turning more and more to Ubuntu just for the pleasure of waiting for compilation, which ensures the latest version, depends on no volunteer maintainer's schedule or Xcode translation skills, and is about as side-effect free as you could possibly hope.

My rule of thumb: if I'm offered a later source version, I take it, and if it fails, I fall back on the latest precompiled binary. My horseback guess is that it works 60% of the time on the first attempt and 75% of the time on the second, and by the third attempt there's a binary.

By the third attempt I had already left R and started using Python, for which package binaries are readily available under Ubuntu :stuck_out_tongue_winking_eye: I have a ton of deadlines to meet, and my co-workers already don't like R, so I don't need the aggravation of compiling each package from source. Anyway, I'll run some more tests, and if I manage to find a time-saving solution, great; otherwise I'll just yield to peer pressure and use Python on the remote server, while continuing to use R on my trusty Mac :slight_smile:

PS: always using the latest version of a package as soon as it's released may expose you to more bugs, and it makes your analyses less reproducible. Binaries are not as cool, but they have their advantages :wink:

These two quotes of yours made my day - it's indeed a very generous contribution (Simon Urbanek, if you read this, thank you, you have my gratitude :pray:), and I could consider getting a :coffee: while compiling :slight_smile:

A few comments on running RStudio on an Ubuntu-based server:

  • yes, it is normal for the packages to take a while to compile from source; this is rarely a problem (we are talking about 5 minutes max.)
  • what can be a problem is getting hold of required external binaries and installing them at OS level (this requires sudo access, which can be a problem for production or production-like environments)
  • a very useful feature of the server edition is the site library - this in effect means that 1) you do the compiling from source only once for all your users and 2) you have a "one truth" version for all users on your team - ensuring that all your code breaks simultaneously :slightly_smiling_face:
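
The site-library point can be sketched as a dry run that prints, rather than executes, the command an administrator might use to compile a package once for every user. This assumes the site library lives at `/usr/local/lib/R/site-library` (the usual location for Ubuntu's R packages, but check `.libPaths()` on your own server); `site_install_cmd` is a made-up helper name.

```shell
#!/bin/sh
# Dry-run sketch: print the command that installs one package into a shared
# site library. site_install_cmd is a hypothetical helper name.

# $1 = CRAN package name
# $2 = site library path (optional; the default below is an assumption)
site_install_cmd() {
    lib=${2:-/usr/local/lib/R/site-library}
    repo="https://cloud.r-project.org"
    printf "sudo Rscript -e 'install.packages(\"%s\", lib = \"%s\", repos = \"%s\")'\n" \
        "$1" "$lib" "$repo"
}

site_install_cmd zoo
```

Because the package lands in the shared library, each user's R session picks up the same compiled copy and nobody has to rebuild it.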

5 minutes per package, and I'm used to using a lot of packages when performing EDAs on Big Data.

:joy: hey, I'm an optimist at heart and I see this as ensuring that all our code runs smoothly simultaneously! But seriously, thanks a lot for your explanations about RStudio Server! Thanks to your and @kevinushey's info, I'm totally sold on it - if you happen to be at NIPS, beers are on me (or coke, or pizza/salad/whatever you're into).

jlacko, jcblum and technocrat,

thanks to all of you for clarifying my doubts about R packages and Linux! I chose jlacko's answer as the solution, because I think RStudio Server suits my use case best, but I'm really thankful to all of you for your help.

PS I considered @mentioning you all in the post to make sure you would get my thanks, but that feature shouldn't be abused according to the community's guidelines.

Python has a big advantage over R in a production environment in which you have to turn over a toolchain model for scaling in a compiled language to the DevOps part of the shop. It's imperative/procedural, just like most of the popular compiled languages.

The internet has a way of compressing the globe, but out there in the physical world it is still a big place. So let me hum quietly (it kinda sorta rhymes with Tipperary) "It's a long way, to NIPS from Prague, it's a long way to go..." and raise the symbolic glass in general westerly direction :beers:

:grin: it's a long way for me too! And, unlike for you, who could find the weather familiar, it's a very cold way for me :cold_face: Montreal in December...OMG! Let me raise my glass in reciprocation :beer: na zdraví!

I would just like to point everyone to my c2d4u PPA, which has almost 4,000 R packages for LTS Ubuntu releases. The Trusty repository is starting to show its age, but the Xenial and Bionic repositories are up to date. They are usually updated every week.

R 3.5: https://launchpad.net/~marutter/+archive/ubuntu/c2d4u3.5
R 3.4: https://launchpad.net/~marutter/+archive/ubuntu/c2d4u

Also, binaries compiled for R 3.4 are not compatible with R 3.5. I needed two repositories so that I didn't break systems with incompatible packages.
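
As a rough sketch of how the R 3.5 PPA might be used, the script below prints (but does not run) the commands for adding it and installing prebuilt packages. It assumes the PPA follows the usual Debian/Ubuntu naming convention of `r-cran-` plus the lower-cased CRAN name, and that the packages you ask for are actually in the PPA; `deb_name_for` and `c2d4u_install_plan` are made-up helper names.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the commands for installing
# prebuilt R packages from the c2d4u PPA for R 3.5.
# deb_name_for and c2d4u_install_plan are hypothetical helper names.

# Map a CRAN package name to its assumed Ubuntu package name:
# "r-cran-" followed by the lower-cased CRAN name.
deb_name_for() {
    printf 'r-cran-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# Print the full command sequence for the given CRAN package names.
c2d4u_install_plan() {
    echo "sudo add-apt-repository -y ppa:marutter/c2d4u3.5"
    echo "sudo apt-get update"
    for pkg in "$@"; do
        echo "sudo apt-get install -y $(deb_name_for "$pkg")"
    done
}

c2d4u_install_plan zoo janitor tidyverse
```

On an R 3.4 system the PPA name would be `ppa:marutter/c2d4u` instead; mixing the two is exactly the incompatibility described above.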
