RStudio Package Manager with Nexus Repository 👍

The public RStudio Package Manager providing binary packages for Linux is great! Adding in a local caching solution makes it even better!

On a clean install of R 4.0.2 on a fresh Ubuntu 20.04 in a WSL2 environment, I tested install times using usethis as the test package (like pak demonstrates):

# usethis package install times.
 2.66m Full - Source - To find out I had missing depends that
                       RSPM allowed me to _easily_ identify and install!
 3.99m Full - Source - Not Cached

26.98s Full - Binary - Not-Cached
10.22s Full - Binary - Cached
 1.02s Partial (mime) - Binary - Cached
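
For a rough sense of scale, the ratios between those timings can be worked out with a few lines of shell. This is only arithmetic on the numbers in the table above; the `speedup` helper is a made-up name, and 3.99m is read as 3.99 minutes.

```shell
#!/bin/bash
# Timings copied from the table above, converted to seconds.
source_uncached=$(awk 'BEGIN { printf "%.2f", 3.99 * 60 }')  # 3.99 minutes
binary_uncached=26.98
binary_cached=10.22

# speedup SLOWER FASTER: prints the ratio with one decimal place.
speedup() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f", a / b }'
}

echo "source -> binary (not cached): $(speedup "$source_uncached" "$binary_uncached")x"
echo "binary not cached -> cached:   $(speedup "$binary_uncached" "$binary_cached")x"
echo "source -> binary (cached):     $(speedup "$source_uncached" "$binary_cached")x"
```

With the table's numbers this works out to roughly 8.9x, 2.6x, and 23.4x.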

I tried a bunch of caching solutions and ended up with apt-cacher-ng and Nexus running on the same WSL2 distro, with startup controlled by systemd. All of the regular apt repos get cached in apt-cacher-ng, while focal-cran40 and the RStudio Package Manager packages get cached in Nexus.

By "a bunch" I mean I worked until I had 'a' solution working with each of Artifactory (had to use the Pro trial), Squid, Varnish, nginx, and mitmproxy. I even tried pak (pkgcache), but I didn't end up trying MRAN. It became obvious that an R-package-only solution, a solution on the same distro/container/client, or one requiring termination setup would be neither a valid solution nor easily reproducible.

It'd be awesome if a limited version of RStudio Package Manager (with none of the Enterprise features like access control, auditing, etc.) came down the pipeline for students/home users to use in a limited-access environment (WSL2/Docker IP constraints or something), but I suppose that's just wishful thinking. :wink:

Anyhow... Thank You Again RStudio Team!


I'll be posting some PowerShell and bash scripts soonish.
By default Nexus doesn't pass along the client's User-Agent, so each Nexus proxy really gets set up for one particular R/OS build, with the User-Agent string to send hard-coded in its configuration.

options(repos = c(REPO_NAME = "http://localhost:8081/repository/rspm-focal-4.0-binary/"))

goes to the proxy regardless of the client's User-Agent string. The proxy then requests from

"remoteUrl": "https://packagemanager.rstudio.com/all/__linux__/focal/latest",

and passes along

"userAgentSuffix": "R/4.0.2 R (4.0.2 x86_64-pc-linux-gnu x86_64 linux-gnu)",
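
As a small sketch of how that suffix is put together for another R/OS build: the string has the shape `R/<version> R (<version> <platform> <arch> <os>)`. The `r_user_agent` function below is a hypothetical helper, not part of Nexus or R; on a real client, `getOption("HTTPUserAgent")` in an R session shows the string that R itself sends.

```shell
#!/bin/bash
# r_user_agent VERSION PLATFORM ARCH OS
# Prints a User-Agent string in the same shape as the suffix shown above.
r_user_agent() {
  printf 'R/%s R (%s %s %s %s)\n' "$1" "$1" "$2" "$3" "$4"
}

# Reproduces the value used for the R 4.0.2 / Ubuntu focal proxy.
r_user_agent 4.0.2 x86_64-pc-linux-gnu x86_64 linux-gnu
```

A proxy for a different R version would need its own repository with the matching string.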


WSL Proxy Setup

This is just a quick little instructional for setting up apt-cacher-ng and Nexus Repository 3 on a WSL2 Debian distro.

Proxy Creation

Install the distro from the app store and run once to create a default user. Then execute the following on the distro (cut and paste into a shell script, terminal, or whatever it is you like to do).


# Base tools. ca-certificates lets curl verify TLS, so -k is not needed.
sudo apt update
sudo apt -y --no-install-recommends install ca-certificates curl gnupg software-properties-common

# Enable systemd inside the WSL2 distro.
URL=https://github.com/DamionGans/ubuntu-wsl2-systemd-script/archive/master.tar.gz
curl -fL "${URL}" --output - | tar zxvf -
cd ./ubuntu-wsl2-systemd-script-master/
bash ./ubuntu-wsl2-systemd-script.sh

# Add the AdoptOpenJDK apt repository, then install apt-cacher-ng and a Java 8 runtime for Nexus.
URL=https://adoptopenjdk.jfrog.io/adoptopenjdk/api/gpg/key/public
curl -fL "${URL}" --output - | sudo apt-key add -
sudo add-apt-repository --yes https://adoptopenjdk.jfrog.io/adoptopenjdk/deb/
sudo apt update
sudo apt -y install apt-cacher-ng adoptopenjdk-8-hotspot-jre

# Override the packaged unit so apt-cacher-ng does not cache anything served from localhost (Nexus).
sudo cp /lib/systemd/system/apt-cacher-ng.service /etc/systemd/system/apt-cacher-ng.service
sudo sed -i '/^ExecStart=/ s/$/ DontCache=".*localhost.*"/' /etc/systemd/system/apt-cacher-ng.service

# Unpack Nexus Repository 3 into /opt/nexus and create its service account.
cd /opt
URL=https://download.sonatype.com/nexus/3/latest-unix.tar.gz
curl -fL "${URL}" --output - | sudo tar zxf -
sudo mv /opt/nexus-* /opt/nexus
NEXUS_DATA=/opt/sonatype-work/nexus3
sudo useradd -r -m -c "nexus role account" -d ${NEXUS_DATA} -s /bin/false nexus
sudo chown -R nexus:nexus ${NEXUS_DATA}

# Register Nexus as a systemd service.
cd ~
cat > ~/nexus.service << EOF
[Unit]
Description=nexus service
After=network.target

[Service]
Type=forking
LimitNOFILE=65536
ExecStart=/opt/nexus/bin/nexus start
ExecStop=/opt/nexus/bin/nexus stop
User=nexus
Restart=on-abort
TimeoutSec=600

[Install]
WantedBy=multi-user.target
EOF
sudo mv ~/nexus.service /etc/systemd/system/
sudo chmod 0644 /etc/systemd/system/nexus.service
sudo systemctl enable nexus

Exit and restart the distro.
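
To see what the `sed` edit in the setup script does to the apt-cacher-ng unit file, the same expression can be run against a sample line. This is just a sketch: the `ExecStart` line below is an assumed example and may not match the one shipped with your package version, and `add_dontcache` is a made-up wrapper name.

```shell
#!/bin/bash
# Assumed sample line; check the real one in /lib/systemd/system/apt-cacher-ng.service.
sample='ExecStart=/usr/sbin/apt-cacher-ng -c /etc/apt-cacher-ng ForeGround=1'

# Same sed expression as in the setup script, reading stdin instead of editing the unit file.
add_dontcache() {
  sed '/^ExecStart=/ s/$/ DontCache=".*localhost.*"/'
}

# Only lines starting with ExecStart= get the DontCache option appended.
printf '%s\n' "$sample" | add_dontcache
```

Lines that do not start with `ExecStart=` pass through unchanged, so the rest of the unit file is left alone.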

That was the primary setup. Now we'll create the Nexus repositories for the CRAN binaries and the RStudio Package Manager binaries.

Repository Creation

Start the Debian distro again.


# Wait for Nexus startup to complete.
sed "/^Started Sonatype Nexus.*$/ q" <(tail -f /opt/sonatype-work/nexus3/log/nexus.log)

# Apt proxy repository for the CRAN Ubuntu binaries (focal-cran40).
REPO_JSON_PATH="/tmp/repo.json"
cat > ${REPO_JSON_PATH} << \EOF
{
  "name": "focal-cran-4.0-proxy",
  "online": true,
  "storage": {
    "blobStoreName": "default",
    "strictContentTypeValidation": true
  },
  "cleanup": null,
  "proxy": {
    "remoteUrl": "https://cloud.r-project.org/bin/linux/ubuntu",
    "contentMaxAge": 1440,
    "metadataMaxAge": 1440
  },
  "negativeCache": {
    "enabled": true,
    "timeToLive": 1440
  },
  "httpClient": {
    "blocked": false,
    "autoBlock": true,
    "connection": {
      "retries": 0,
      "userAgentSuffix": "string",
      "timeout": 60,
      "enableCircularRedirects": false,
      "enableCookies": false
    }
  },
  "routingRule": "string",
  "apt": {
    "distribution": "focal-cran40",
    "flat": false
  }
}
EOF
curl "http://localhost:8081/service/rest/beta/repositories/apt/proxy" \
  -vvv --user "admin:$(cat /opt/sonatype-work/nexus3/admin.password)" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -d "@${REPO_JSON_PATH}"

# R proxy repository for the RStudio Package Manager binaries.
REPO_JSON_PATH="/tmp/repo.json"
cat > ${REPO_JSON_PATH} << \EOF
{
  "name": "focal-rspm-4.0-binary",
  "online": true,
  "storage": {
    "blobStoreName": "default",
    "strictContentTypeValidation": true
  },
  "cleanup": null,
  "proxy": {
    "remoteUrl": "https://packagemanager.rstudio.com/all/__linux__/focal/latest",
    "contentMaxAge": 1440,
    "metadataMaxAge": 1440
  },
  "negativeCache": {
    "enabled": true,
    "timeToLive": 1440
  },
  "httpClient": {
    "blocked": false,
    "autoBlock": true,
    "connection": {
      "retries": 0,
      "userAgentSuffix": "R/4.0.2 R (4.0.2 x86_64-pc-linux-gnu x86_64 linux-gnu)",
      "timeout": 20,
      "enableCircularRedirects": false,
      "enableCookies": true
    }
  },
  "routingRule": "string"
}
EOF
curl "http://localhost:8081/service/rest/beta/repositories/r/proxy" \
  -vvv --user "admin:$(cat /opt/sonatype-work/nexus3/admin.password)" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -d "@${REPO_JSON_PATH}"
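
Since a separate R proxy is needed per R/OS build, the R payload above could be generated from a few parameters instead of being edited by hand. The following is a minimal sketch under that assumption; `r_proxy_json` is a hypothetical helper that mirrors the fields of the payload above and does no JSON escaping, so the values must not contain double quotes or backslashes.

```shell
#!/bin/bash
# r_proxy_json NAME REMOTE_URL USER_AGENT_SUFFIX
# Prints a request body shaped like the R proxy payload above.
r_proxy_json() {
  cat << EOF
{
  "name": "$1",
  "online": true,
  "storage": {
    "blobStoreName": "default",
    "strictContentTypeValidation": true
  },
  "cleanup": null,
  "proxy": {
    "remoteUrl": "$2",
    "contentMaxAge": 1440,
    "metadataMaxAge": 1440
  },
  "negativeCache": {
    "enabled": true,
    "timeToLive": 1440
  },
  "httpClient": {
    "blocked": false,
    "autoBlock": true,
    "connection": {
      "retries": 0,
      "userAgentSuffix": "$3",
      "timeout": 20,
      "enableCircularRedirects": false,
      "enableCookies": true
    }
  },
  "routingRule": "string"
}
EOF
}

# Prints the same values as the focal / R 4.0.2 repository created above.
r_proxy_json "focal-rspm-4.0-binary" \
  "https://packagemanager.rstudio.com/all/__linux__/focal/latest" \
  "R/4.0.2 R (4.0.2 x86_64-pc-linux-gnu x86_64 linux-gnu)"
```

The output can be piped to the same `curl` call as above by passing `-d @-`, which makes curl read the body from stdin.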

That's it. You can now exit the distro and the services will keep running. Thanks to systemd, they will also start whenever the distro is started.

Client Apt Setup

Using an Ubuntu distro from the app store that has a sudoer user...


# Apt proxy setup.
cat > ~/wsl-apt-proxy << \EOF
#!/bin/bash
# Executed as part of apt.conf.d/00-wsl-apt-proxy
# Prints the proxy URL if apt-cacher-ng answers on port 3142, otherwise DIRECT.
WSL_HOST_IP=localhost
exec 9<>/dev/tcp/${WSL_HOST_IP}/3142
STATUS=$?
exec 9>&-
if [ "X$STATUS" = "X0" ]; then
  echo "http://${WSL_HOST_IP}:3142"
else
  echo "DIRECT"
fi
EOF
sudo chown root:root ~/wsl-apt-proxy
sudo mv ~/wsl-apt-proxy /usr/sbin/
sudo chmod +x /usr/sbin/wsl-apt-proxy
echo 'Acquire::http::Proxy-Auto-Detect "/usr/sbin/wsl-apt-proxy";' | sudo tee /etc/apt/apt.conf.d/00-wsl-apt-proxy

# Point apt at the Nexus proxy for the CRAN binaries.
sudo apt-add-repository "deb http://localhost:8081/repository/focal-cran-4.0-proxy focal-cran40/"
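
The port probe in the wsl-apt-proxy script above can also be written as a small function, which makes it easy to try by hand before wiring it into apt. This is a sketch of the same idea rather than a drop-in replacement; `detect_apt_proxy` is a made-up name, and the subshell is my own variation so the test descriptor is closed automatically.

```shell
#!/bin/bash
# detect_apt_proxy HOST PORT
# Prints http://HOST:PORT if a TCP connection can be opened, otherwise DIRECT.
detect_apt_proxy() {
  # bash maps /dev/tcp/HOST/PORT to a TCP connection; the subshell keeps
  # descriptor 9 (and any connection error message) out of the calling shell.
  if (exec 9<>"/dev/tcp/$1/$2") 2>/dev/null; then
    echo "http://$1:$2"
  else
    echo "DIRECT"
  fi
}

# Same host and port as the script above.
detect_apt_proxy "${WSL_HOST_IP:-localhost}" 3142
```

If apt-cacher-ng is not running, the function falls back to `DIRECT`, so apt keeps working without the cache.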

R-Client Config

There isn't much to do here... Just use whatever method you prefer to ensure your repo is set to fetch from Nexus.


options(repos = c(nexus = "http://localhost:8081/repository/focal-rspm-4.0-binary/"))
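
One possible method, sketched here as an assumption rather than a recommendation, is to append that line to an R startup file such as `~/.Rprofile`. The `set_r_repo` helper below is hypothetical; it only adds the line when the URL is not already present in the file.

```shell
#!/bin/bash
# set_r_repo FILE URL
# Appends an options(repos = ...) line to FILE unless URL already appears in it.
set_r_repo() {
  touch "$1"
  if ! grep -qF "$2" "$1"; then
    printf 'options(repos = c(nexus = "%s"))\n' "$2" >> "$1"
  fi
}
```

For example, `set_r_repo ~/.Rprofile "http://localhost:8081/repository/focal-rspm-4.0-binary/"` would make new R sessions for that user fetch from Nexus.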

Testing It Out

In a different terminal (or tab) start a session that will monitor the logs.


wsl -d Debian -- tail -f /var/log/apt-cacher-ng/apt-cacher.log /var/log/apt-cacher-ng/apt-cacher.err /opt/sonatype-work/nexus3/log/request.log

Ubuntu client usage testing.


sudo apt update
sudo apt -y upgrade
sudo apt -y install r-base

# An install.packages call that will cache the packages.
mkdir -p ~/R-cache-test
R --quiet -e 'options(repos = c(nexus = "http://localhost:8081/repository/focal-rspm-4.0-binary/"));
  sw.start <- Sys.time(); install.packages("usethis", lib="~/R-cache-test", quiet=TRUE); sw.end <- Sys.time(); sw.end - sw.start'

# An install.packages call that will use the cached packages.
rm -Rf ~/R-cache-test/*
R --quiet -e 'options(repos = c(nexus = "http://localhost:8081/repository/focal-rspm-4.0-binary/"));
  sw.start <- Sys.time(); install.packages("usethis", lib="~/R-cache-test", quiet=TRUE); sw.end <- Sys.time(); sw.end - sw.start'

At this point the log monitor should have shown that apt-cacher-ng and Nexus are both working.

You should also have noticed improved install times.

Explore the control interface and documentation for Nexus to set up other proxies.


Figured I'd drop another :+1: for using RStudio with Nexus when you work with containers that you nuke and rebuild often... Using a Nexus raw proxy to fetch and cache RStudio IDE builds is very much like using a caching proxy such as Squid or Varnish, but easier (imho) to set up and use.

Pointing the proxy remote at the S3 bucket and writing a lil translation script works great (20s down to less than 1s)!

#!/bin/bash

# Translate Latest RStudio IDE OSS Preview Release URL
URL=https://rstudio.org/download/latest/preview/desktop/bionic/rstudio-latest-amd64.deb
# Follow the redirects with a HEAD request, print the final URL, and point it at the local Nexus proxy.
curl "$URL" -s -L -I -o /dev/null -w '%{url_effective}\n' | \
  sed 's|s://s3.amazonaws.com|://localhost:8081/repository/|'
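
To check the rewrite without hitting the network, the same `sed` expression can be applied to a sample URL. This is only a sketch: the bucket and file name below are placeholders, not a real build, and `translate_url` is a made-up wrapper name.

```shell
#!/bin/bash
# translate_url URL
# Applies the same sed expression as the script above to a single URL.
translate_url() {
  printf '%s\n' "$1" | sed 's|s://s3.amazonaws.com|://localhost:8081/repository/|'
}

# Placeholder URL in the shape of an S3 download link.
translate_url 'https://s3.amazonaws.com/example-bucket/rstudio-x.y.z-amd64.deb'
```

For this input the expression yields `http://localhost:8081/repository//example-bucket/rstudio-x.y.z-amd64.deb`. Note the doubled slash after `repository/`, which comes from the path's own leading slash.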

Have fun!
