Using Linux Lmod with R and RStudio

Hello R-Admin community,

I am posting here to ask about experience with what seems to be fairly advanced Linux territory for the R community. I did not find much information on this for R users and admins.

In our server environment, we use a tool called Environment Modules so that a user can modify their current bash environment to use specific scientific tools. Specifically, we use the Lmod implementation (https://lmod.readthedocs.io/en/latest/). We use modules to create environments for tools like LaTeX, Python, AMPL, XPRESS, or even base R with a special .libPaths().

Integration with modules is now supported in RStudio Server Pro, to load a module together with a specific R version. See https://docs.rstudio.com/ide/server-pro/r-versions.html#extended-r-version-definitions
This is useful, but it means having a predefined set of R versions, each tied to some modules, and Lmod is only available to the R ecosystem through the RStudio product.

I am still trying to work out the possible strategies for integrating modules with R.

I would like to know if anyone has experience with any of this.

Can we use Lmod modules from within R to modify the R session?
From my tests, it does not seem possible. Calling system("module load my_module") from inside an R process does not update R's environment. I am not sure it is even possible.

One way to achieve this currently is to load the module in ~/.bashrc, which modifies the environment just before R is launched. Even in Rprofile it seems too late.

Does anyone know of a tool that helps deal with modules in R? Is this even possible?
Before digging into some tricky approach or creating a new tool, I would like to know what others think.

Regarding RStudio Server Pro support, an interesting new feature would be a project option that lets a user configure which modules to load for a project. I don't know where the best place is to submit such a feature request.

Thanks all. Anyone who has any clue or experience regarding modules (like Lmod) and R is more than welcome to chime in!

You are able to do this now.

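If you add /etc/rstudio/rsession-run, you can set environment variables and run scripts for EACH RStudio session:
#!/bin/bash
# Put the selected R installation first on the library, binary, and man paths.
LD_LIBRARY_PATH=$R_HOME/lib:$LD_LIBRARY_PATH
PATH=$R_HOME/bin:$PATH
MANPATH=$R_HOME/share:$MANPATH
TMPDIR=/local1/tmp
export MANPATH PATH LD_LIBRARY_PATH TMPDIR
# Pull in the extra tool environment (here: TeX).
source /home/biotools/tex/current/PKG_PROFILE
# Hand control to the command that was passed to this script.
exec "$@"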
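
To make the wrapper idea concrete, here is a minimal sketch of the pattern such a script relies on: export some variables, then hand control to whatever command was passed in, so that command inherits them. The variable name DEMO_TMPDIR and the temporary wrapper file are only illustrative assumptions, not anything RStudio defines.

```shell
#!/bin/sh
# Write a tiny wrapper script to a temporary file (illustrative only).
wrapper=$(mktemp)
cat > "$wrapper" <<'EOF'
#!/bin/sh
# Export a variable, then replace this process with the requested command.
DEMO_TMPDIR=/local1/tmp
export DEMO_TMPDIR
exec "$@"
EOF

# Run a command through the wrapper; it sees the exported variable.
result=$(sh "$wrapper" sh -c 'echo "$DEMO_TMPDIR"')
echo "wrapped command saw: $result"

rm -f "$wrapper"
```

Without the final exec line, the wrapper would set its variables and then exit without ever starting the wrapped command.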

Alternatively, you can use the new r-versions and have MULTIPLE profiles and environments.

Path: /home/biotools/r/R-3.6.1
Label: Current Production Version
Script: /etc/rstudio/rsession-run.sh

It is possible. However, calling system("module load my_module") from inside an R process, as you do, only sets the variables in the subprocess spawned by the system call. That subprocess goes away when the call ends and control returns to R, and its environment goes with it.

You want to change the environment of the process R is running in.
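
The same effect is easy to reproduce in a plain shell, with no R or modules involved. This is just a small sketch; DEMO_VAR is a made-up variable name.

```shell
#!/bin/sh
unset DEMO_VAR

# The child shell exports DEMO_VAR, but only inside its own process.
child_value=$(sh -c 'export DEMO_VAR=from_child; echo "$DEMO_VAR"')

# Back in the parent, the variable was never set.
parent_value=${DEMO_VAR:-unset}

echo "child saw:  $child_value"
echo "parent has: $parent_value"
```

The child sees from_child while the parent still reports unset, which is the situation R is in after a system() call returns.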

Sys.setenv(myvar="myval") can be used to set a single environment variable. A step in the right direction, but not enough.

readRenviron lets you "Set Environment Variables from a File"

So, if you had a process to build such a file "on the fly" from your module, you could save its output as a temp file, load the temp file, and delete it.

Here is a process that works from the command-line for the module named "cuda"

mymod=cuda sh -c 'eval "$(modulecmd bash load ${mymod})"; env'
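
The key detail in that one-liner is that modulecmd only prints shell code; nothing changes until the printed code is evaluated. Here is a small sketch of that behaviour. fake_modulecmd is a hypothetical stand-in I made up so the example runs without any module system installed, and DEMO_HOME is an invented variable.

```shell
#!/bin/sh
unset DEMO_HOME

# Hypothetical stand-in for `modulecmd bash load <name>`: like the real
# command, it only PRINTS shell code and changes nothing by itself.
fake_modulecmd() {
    printf 'export DEMO_HOME=/opt/%s;\n' "$1"
}

# Running it alone just produces text; the variable stays unset.
fake_modulecmd cuda > /dev/null
without_eval=${DEMO_HOME:-unset}

# Evaluating the printed code is what actually modifies this shell.
eval "$(fake_modulecmd cuda)"
with_eval=${DEMO_HOME:-unset}

echo "without eval: $without_eval"
echo "with eval:    $with_eval"
```

This is why the command substitution has to sit inside the eval: the environment is only updated in the shell that evaluates the output, and env then has to run in that same shell.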

So, build strings like that in R and pass them to system (redirecting their output to a temp file).

Or, probably better, use R's system2 function to pass the value of mymod in the environment and specify where stdout should go, and tempfile to get a safe place to put that stdout.

Then load that temp file with readRenviron ... and don't forget to delete it.
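
As a rough sketch of the shell side of this, the helper below writes only the variables that a snippet adds or changes into a file, one NAME=value per line, which is the general shape readRenviron expects. dump_env_delta is a hypothetical name of my own, not part of Lmod or R, and I am assuming values are single-line and free of quoting characters; anything more exotic would probably need extra handling.

```shell
#!/bin/sh
# Hypothetical helper: record what a shell snippet adds to the environment.
#   $1  shell code to evaluate, e.g. 'eval "$(modulecmd bash load cuda)"'
#   $2  output file, one NAME=value per line
dump_env_delta() {
    snippet=$1
    out=$2
    before=$(mktemp) || return 1
    after=$(mktemp) || { rm -f "$before"; return 1; }

    # Environment before the change, sorted so comm can compare listings.
    env | LC_ALL=C sort > "$before"

    # Apply the snippet in a subshell so the caller's own environment is
    # untouched, then record the resulting environment.
    ( eval "$snippet" >/dev/null 2>&1; env | LC_ALL=C sort ) > "$after"

    # Keep lines found only in the "after" listing that look like NAME=value.
    # Assumption: values do not span multiple lines.
    LC_ALL=C comm -13 "$before" "$after" \
        | grep -E '^[A-Za-z_][A-Za-z0-9_]*=' > "$out"

    rm -f "$before" "$after"
}
```

From R, the idea would be to call this through system2 with a path from tempfile(), pass that path to readRenviron, and then remove the file. Whether a plain module load works as the snippet depends on the module function being defined in the shell that runs it; otherwise the explicit eval form above is probably the safer choice.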

But why do you really want to do this? And do you really want to do it in the middle of an ongoing R session? I suppose that if you were in the middle of a long-running R session with lots of state, and you wanted to use one of your environment module programs without losing that state, it would be a good idea.

And if you want to do all this under RStudio Server and are not the admin of the server, then you need to get some cookies for the admin, so they will arrange for environment modules to even be available. The cookies might have to be special if you're using the open source or "Pro" version of RStudio Server.

Might this work on Windows? No idea.

Let us know if you put the pieces together.

I don't see any other way of doing it.

Thanks for all this useful information.

I am the admin of our clusters, and we use Linux Lmod modules to help users deal with different environments (Python, R, optimisation tools, other tools, ...). This is used heavily in our Slurm clusters too.

For our RStudio Server Pro cluster, R users can't easily use modules to load the correct environment values, for example to use optimisation tools (like AMPL or XPRESS). It would be interesting to be able to do so, and per project.
Currently, we can load modules through the RStudio Server Pro support, but it is per R version, not per project.
Alternatively, a user has to modify their .bashrc to load the wanted modules. The .bashrc is run before R is launched, so it works. But it is not per project: every R session will have the modules.

A tool that captures the result of module load and then loads it into R would be really interesting. I will look into that. Thank you!