Slow .libPaths() refresh under web based RStudio

rstudio-server

#1

Dear RStudio/R experts,
I trust all is well at your end. I would greatly appreciate any insight into the differences in ".libPaths()" refresh behavior between the browser and shell environments. Searching online for relevant information did not help, hence this request for your expertise.

We use web-based RStudio on an AWS EC2 instance, with Google Chrome as the browser. In the web-based RStudio console:

a) Upon login, typing .libPaths() in the console returns almost instantaneously (under about 15 seconds at peak times). The default path in .libPaths() points to an individual, secure "HOME" location on the company's network, e.g. <Company_Network/home// x86_64-pc-linux-gnu-library/<version#>, which is the default configuration in an R login session.

b) After updating the library path with .libPaths("/data/R/lib"), the response to .libPaths() in the console is quick, taking from under 1 second to a few seconds (under 10 seconds at peak times). "/data" is ephemeral storage collocated with the EC2 server instance.

c) After updating the library path with .libPaths("<Company_network_Filer>/R/lib"), the response to .libPaths() in the console takes a few minutes (2 to 5 minutes at any time of day). "<Company_network_Filer>/R" is another R library location, on the company's traditional (non-cloud) network file share and outside the individual's secure environment.

As noted above, .libPaths() refresh times are highly variable under web-based RStudio, and so are responses to other console-based R commands and functions. For example, dim(data_frame) or head(data_frame) at times takes more than 20 seconds.

In contrast, performing all of the above from a shell command line gives consistent, almost instantaneous responses, whether R is run in the shell inside RStudio or in a classic Linux shell.
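To put numbers on the comparison, the same measurement could be run in both the RStudio console and a shell R session. The following is only a rough sketch: time_lib_path() is a hypothetical helper name, and timing a directory listing is included on the assumption that a scan of the library directory is where storage latency would show up.

```r
# Hypothetical helper: time setting a library path, reading it back,
# and listing the directory contents. All timings are elapsed seconds.
time_lib_path <- function(path) {
  old <- .libPaths()
  on.exit(.libPaths(old), add = TRUE)  # restore the original paths afterwards

  # Note: .libPaths() silently ignores a directory that does not exist.
  set_sec  <- system.time(.libPaths(path))[["elapsed"]]
  get_sec  <- system.time(.libPaths())[["elapsed"]]
  list_sec <- system.time(list.files(path))[["elapsed"]]

  data.frame(
    path = path,
    set_sec = set_sec,
    get_sec = get_sec,
    list_sec = list_sec,
    stringsAsFactors = FALSE
  )
}
```

For example, time_lib_path("/data/R/lib") could be run once from each environment, and then again with the network filer path, to see which step accounts for the delay.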

Looking forward to your insight and help.


#2

Thanks for your post and apologies for the delayed response.

Unfortunately, I do not have any immediate thoughts / feedback, but perhaps others will! Am I correct in summarizing that in RStudio:

  • Performance for co-located EC2 resources is good, but varies with system load
  • Performance on the network share is bad

When you contrast performing the above on the "Shell" command line, you are talking about using the Terminal within RStudio and also through an SSH session, correct? Are these commands affected by the network share, as well? I.e. if you start R in a terminal, does writing objects to the network share slow things down as well?
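A quick way to check the write question might look like the sketch below. It is only an illustration: time_write() is a hypothetical helper, the object size is arbitrary, and it assumes you have write permission in the directory being tested.

```r
# Hypothetical helper: time how long it takes to write a test object
# into a given directory, then clean up the file.
time_write <- function(dir, n = 1e5) {
  obj <- data.frame(x = seq_len(n), y = rnorm(n))
  target <- tempfile(pattern = "write_test_", tmpdir = dir, fileext = ".rds")
  on.exit(unlink(target), add = TRUE)  # remove the test file when done

  # Elapsed (wall-clock) seconds for the write
  system.time(saveRDS(obj, target))[["elapsed"]]
}
```

Running it against "/data" and against a directory on the network share, from both the RStudio console and a terminal R session, would show whether the slowdown follows the storage or the front end.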

Do you mind explaining a bit more about the network share? Is that over NFS?
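If it is not obvious how the share is mounted, something like the sketch below could report the filesystem type from within R. It assumes a Linux server where /proc/mounts is readable; mount_info() is a hypothetical helper, not part of R or RStudio.

```r
# Hypothetical helper: find the mount entry that contains `path` by
# picking the longest matching mount point listed in /proc/mounts.
mount_info <- function(path, mounts_file = "/proc/mounts") {
  fields <- strsplit(readLines(mounts_file), " ", fixed = TRUE)
  tab <- data.frame(
    device      = vapply(fields, `[`, character(1), 1),
    mount_point = vapply(fields, `[`, character(1), 2),
    fs_type     = vapply(fields, `[`, character(1), 3),
    stringsAsFactors = FALSE
  )

  path <- normalizePath(path, mustWork = FALSE)
  is_prefix <- vapply(
    tab$mount_point,
    function(mp) mp == "/" || path == mp || startsWith(path, paste0(mp, "/")),
    logical(1),
    USE.NAMES = FALSE
  )

  hits <- tab[is_prefix, , drop = FALSE]
  hits[which.max(nchar(hits$mount_point)), , drop = FALSE]
}
```

An fs_type of "nfs" or "nfs4" for the library directory would indicate the share is mounted over NFS.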

One random thought I have - the terminal within RStudio is implemented via a websocket (I think), so perhaps there are differences from a performance perspective under load versus firing requests through the R console. I am definitely not the most experienced person to comment on such a difference though :slight_smile:


closed #3

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.