I'm dreaming of a (hosted) RStudio CI/CD

shiny
docker
shinyappsio
travis-ci
dependencies

#1

I love R and the RStudio/tidyverse ecosystem, but despite the fact that I am a full-time R dev now, I spend way too much time managing (R and external) dependencies and maintaining R environments, including:

  • my local machine
  • shinyapps.io (for apps, both for projects and for packages)
  • travis-ci.com (for R CMD build but also general project CI/CD)
  • (soon) our own RStudio Connect server
  • at some point (god forbid) docker container with all that stuff deployed to clients

I feel like I am solving the same problems (R and system dependencies) over, and over, and over again – or at least more than just once. This is now at a point where it's really hurting the viability of my work in the ecosystem (perhaps my fault).

The recent Ubuntu / R 3.5.0 shenanigan might be example for this problem: my deploys failed b/c travis-ci docker container (unaccountably?) already had R 3.5.0, but shinyapps.io did not (not RStudios fault). So the two environments where out of whack, and I had to figure out why. It's a pretty minor thing that was quickly solved, but these things multiply with each R environment you have to worry about.

I know some people might get by just with deploying from within RStudio (rsconnect::...) but I would guess that with some collaborators, this kind of process gets weird fast (without any CI/CD).

I'm not here to complain or anything, I just wanted to find out whether other people were facing the same challenges, or whether I was somehow doing it wrong.
Perhaps, if I don't have this completely wrong, it might also be interesting for RStudio to gauge interest in related features or products.

I haven't thought this through, and I don't understand much of the scaffolding behind it, so this may be completely ignorant, but, I've been dreaming for some time of a (hosted) RStudio CI/CD product (with deployment to shinyapps.io and perhaps https://rstudio.cloud at some point).

I understand that replicating something like Travis or Jenkins just for R might be insane scope creep, but I'd just really love to only have to worry about one canonical R build and deploy environment.

Several ideas, in increasing complexity come to mind:

  • make shinyapps.io, rstudio.cloud and self-hosted connect travis-ci deployment providers.
  • let shinyapps.io and connect take its dependencies optionally from DESCRIPTION (not packrat, which can be hard to reason about). DESCRIPTION at least, is already used on the travis-ci R docker container and, of course, R packages, so that would make things a little easier. (I understand that packrat is way more powerful than DESCRIPTION deps, and DESCRIPTION does not ensure availability of old binaries (though Microsofts timemachine does). Packrat is just so complex, it often seems to cause more problems than it solves.)
  • let rstudio.cloud (and connect) commit to github from inside the respective browser UIs, instead of just save. That would really dramatically lower the cost of entry.
  • create (or just document?) an easy way for me to download and run the shinyapps.io docker image, so I can debug it interactively on my machine (or just switch all my development into that container, so I don't have to worry about my local environment).
  • I would also be quite happy to cut travis out of the loop entirely, if only RStudio (Connect?) had a hosted variant and (better) CI/CD integration. (I previously mentioned this in ticket http://support.rstudio.com/hc/requests/19752 and Jeff Allen had some encouraging words).

I just want to be able to have a project (or package) locally, to write up the dependencies once and then to push it to some well-defined cloud environment at RStudio (or on-prem), and have it be tested and deployed there, all from the same environment, be it hosting a shinyapp, an RStudio Connect product, some bookdown deployment, or even an R CMD check for a CRAN-bound package.

I can't stress how happy I would be to pay for this on a monthly basis, even just for the CI/CD service, that'd be easily worth 50-100$ for me, just as long as I can cut down on the dependency hell.

Hope this is meaningful/helpful and I'm not being too unclear/incompetent.


#2

Thanks @maxheld83, we have been hearing from more and more people that they would like help with getting pre-built binaries for Linux for both R and the packages. It sounds like what you are looking for here is the ability to define your bundle, get it tested in the cloud, and then be able to use it in a variety of places. What I am less clear on is whether all your use cases would be in the cloud, or whether you would want to be able to download the docker container that serves as a foundation for your work.

It may make sense to schedule a 30 minute call with a few people here to see if we can flesh out the requirements a bit more and make sure we understand the ask. If there others who are interested, please let us know. I would love to see how we can help.


#3

oh wow, thanks for taking the time to respond to this @tareef.

Let me try and explain a bit more (shorter, though :slight_smile:).

As I imagine many people, I currently have a pipeline that involves rendering (*.Rmd) or compiling (packages) R projects on:

  • local machine
  • travis (for CI/CD from GitHub)
  • shinyapps or rstudio connect or shiny server pro

each of which are at least subtly different R environments. This increases complexity, and causes hickups such as last weeks mismatch b/w (apparently) shinyapps and travis' R docker image, one of which had 3.5.0, the other didn't, which borked up the app deploy (The delay on shinyapps isn't the issue; the mismatch is).

This is further complicated by the way in which dependencies (both R and beyond) are specified:

  • local: homebrew or apt-get, and just random install.packages(), or of course packrat.
  • travis: DESCRIPTION for R packages, and .travis.yml for system dependencies (mostly apt-get)
  • shinyapps: shinyapps-package-dependencies for system deps, and dark magic packrat for R packages.

This just opens up a lot of sources for confusion, and I sometimes feel like I am solving problems several times.

I my not understand enough about the tech underneath to make a good suggestion how to improve this.

But: using the same docker img on my machine and on shinyapps (and hosted connect, and http://rstudio.cloud) would be fantastic indeed. Perhaps even integrated into RStudio, with a short pointer in the .Rproj file (or DESCRIPTION or somewhere else) to which version of the canonical RStudio docker img I currently want.

Then, to have some CI/CD, it'd be really nice to be able to check and deploy directly from GitHub, and have the some RStudio service launch and run, check, render and deploy my R-project (with reference to the container), with some minimal (travis/jenkins-like) UI (and gh integration) which tells me the build status for every commit.

People who want to go off into the docker deep end can of course already do this. But I imagine that I'm not the only person who doesn't want to provision my own docker image (even off of the popular R docker images). I'd actually prefer a canonical docker image maintained by RStudio, with limited dependencies (much like
shinyapps-package-dependencies actually).

Am I making any sense?


#4

Yes, I think I understand the use case. Essentially, it would be nice if you could configure your working environment in one place, and then essentially make it usable in a variety of other places, but have someone else worry about how to make that docker image available in these different places. Do i have it mostly right?


#5

yes, exactly.

I also wouldn't mind using working environment (~ Docker image?), curated by RStudio, with reasonable (but limited) default system dependencies, and a carefully tested migration to new R releases and such.

I like the way the shinyapps.io images are limited, tested and maintained by RStudio. (The shinyapps-package-dependencies repo also seems to be an elegant and rigorous way to add edge case sys deps).

Perhaps this image might even default to a set of (nonconflicting) versions of some core R packages (say, tidyverse + r-lib).

Basically, I'd be happy if there was an easy way for me to use the shinyapps image locally in RStudio (via docker), as well as on some (RStudio?) hosted CI/CD service.

Lastly, I'd love it if rsconnect::deploy... would also allow me to use DESCRIPTION or Microsoft's checkpoint package to specify R dependencies. Anything but packrat under the hood, which I apparently am too stupid to use. This is related to the above concern, because (to me) the packrat-black-magic in rsconnect::deploy...ments greatly complicates things by adding another orthogonal dependency management.