Shiny Server "Failing" on Load Test

We have installed the open-source Shiny Server on a machine with 31 GB of total RAM, about 26 GB of which is usable after accounting for overhead. The server hosts a Shiny app with a very small footprint that can only be accessed from within a secure intranet. The app allows users to upload data and then perform various analyses on it; the app itself is only code and does not load any data at startup.

We are now doing a load test to make sure the app can handle simultaneous users on the server. When users head over to the URL and use the app with small data files, we experience no problems.

However, when one user heads to the URL and uploads a large file, all other users are blocked from accessing the URL (it times out for them). This occurs even though memory usage on the Unix machine only reaches about 10%.

After a week of tuning memory settings (e.g., the oom-killer) on the server, the issue persists, and we are now at a point of not understanding why it occurs or what additional diagnostics or fixes to apply.

Can anyone advise?

Thanks
Harold


Hi Harold,

I'm not a Shiny expert (at all) but I had a really similar situation with a tool I made at my last job where people would upload data to have it cleaned and loaded into Salesforce.

I believe what is happening is that the open-source Shiny Server can only run one R process per application. Because of this, and because R is single-threaded, requests get queued and each user has to wait for the previous user's request to finish.
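The effect is easy to reproduce with a toy app (a sketch, not the original code): because every session of an app shares one R process on the open-source server, a long computation in one session freezes every other connected session.

```r
library(shiny)

ui <- fluidPage(
  actionButton("go", "Run long job"),
  textOutput("done")
)

server <- function(input, output, session) {
  output$done <- renderText({
    req(input$go)
    Sys.sleep(30)  # simulates a large upload/analysis; while this runs,
                   # every other session served by this process is stalled
    "finished"
  })
}

shinyApp(ui, server)
```

Opening the app in two browser tabs and clicking the button in one makes the other unresponsive until the sleep finishes.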

To my knowledge you have two options (although there are probably more!):

  1. You can upgrade to Shiny Server Pro which is able to run multiple processes per application.
  2. Depending on how many users you have, you could run a separate Shiny app for each user (or team, depending on frequency of use).

It would be great if someone more knowledgeable on this could confirm if the above is still true, but hopefully it at least points you in the right direction for getting this working how you need!

@scottbrenstuhl Yup, that's right. Another option would be to perform the analysis in a separate process so you don't block Shiny. You can do this with packages like future or parallel. This shiny-discuss thread has examples on how to make this work with Shiny: https://groups.google.com/forum/#!topic/shiny-discuss/jJjEtYWOMik
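A minimal sketch of that idea using the future package, with a hypothetical heavy_analysis() function standing in for the real work (untested here; the linked thread has fuller examples):

```r
library(shiny)
library(future)
plan(multisession)  # run futures in separate background R processes

server <- function(input, output, session) {
  fut <- NULL
  result <- reactiveVal(NULL)

  observeEvent(input$run, {
    # heavy_analysis() is hypothetical; it runs in a background
    # process, so this Shiny process stays responsive for other users
    fut <<- future(heavy_analysis(input$file$datapath))
  })

  # Poll the future once a second until it resolves
  observe({
    invalidateLater(1000, session)
    if (!is.null(fut) && resolved(fut)) {
      result(value(fut))
      fut <<- NULL
    }
  })

  output$summary <- renderPrint(result())
}
```

The polling loop is only needed because this predates built-in async support in Shiny; the main point is that the blocking work leaves the main process.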

Shiny is also getting built-in support for async sometime soon. Check out these articles to see what's on the horizon.


Something like system("Rscript upload.R &", wait = FALSE), where upload.R just reads a file into memory, might work. I haven't tried this in a Shiny app, but it has worked for me in ETL jobs, so with proper error checking in your Shiny app it should work (in theory).

Some suggestions:

  • try observeEvent() and action buttons in Shiny to trigger the upload and check its status
  • use a high-performance file-reading library like readr or data.table instead of the base R functions. For example, fread() from data.table behaves vastly better than read.csv() on large files.
  • if your Shiny app is fairly simple, try deploying it via Docker on the Linux VM. This opens up a few possible hacks, such as using nginx as a load balancer over the Docker container IP addresses, or running a lightweight login app that spins up a new Docker container for each user.
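As a rough sketch of the fread suggestion above (the file name is hypothetical, and timings will vary by machine and file):

```r
library(data.table)

# fread() auto-detects the separator, reads with multiple threads, and
# avoids much of read.csv()'s type-conversion overhead, so it is
# typically many times faster on large files.
dt <- fread("big_file.csv")

# Base R equivalent for comparison; expect this to be much slower:
# df <- read.csv("big_file.csv", stringsAsFactors = FALSE)
```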

Thanks, everyone. As an update: I see that the open-source version isn't optimal for multiple users, so we have upgraded to Shiny Server Pro. For clarification, this app uses the readr package to read in files, and it is remarkably fast with our huge fixed-width (fwf) files.

Now, even with Shiny Server Pro, we are experiencing exactly the same issue: only one user at a time can work in the app. Once user 1 is done, the app loads on a second computer and then user 2 can do work.

We set up the utilization scheduler as suggested in the help documentation (using the exact settings found there), but even after doing that, we can still only have one user at a time working in the app.

Again, we are not spiking memory at all; only about 5% of memory is in use by the app while user 1 is accessing the URL and doing work.

Harold, can you take a look at: https://support.rstudio.com/hc/en-us/articles/220546267-Scaling-and-Performance-Tuning-Applications-in-Shiny-Server-Pro

You may need to change some of the utilization scheduler settings. For instance:

utilization_scheduler 1 0.9 40

would more aggressively create R processes to serve new users. The default settings of 20 0.5 3 allow up to 20 concurrent users per R process, so you would likely still see a performance bottleneck.
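For reference, the directive goes in shiny-server.conf, typically scoped to a location block; a hypothetical fragment (paths are examples) might look like:

```
# utilization_scheduler <max-connections-per-process> <load-factor> <max-processes>
location / {
  site_dir /srv/shiny-server;
  log_dir /var/log/shiny-server;
  utilization_scheduler 1 0.9 40;
}
```

With a maximum of 1 connection per process and up to 40 processes, each arriving user gets a fresh R process rather than sharing one.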

Feel free to reach out to your sales contact if you want to set up a call to discuss further.

The async work mentioned earlier will also go a long way in solving this exact problem.

As a follow-up question, how are you "load testing" your application?

@slopp, it's human-based where multiple users simultaneously head over to the URL at scheduled times. Each is tasked with an operation in the app and we monitor diagnostics while users are doing their work.

@slopp, this is very helpful. I am currently using
utilization_scheduler 1 0.5 15

and some things worked out quite well. I will try with your recommended settings also

@slopp, assuming I want to be as close to scenario B (as described in the tuning and scaling document) as possible, what is the better setting for the utilization scheduler?

Option 1
utilization_scheduler 1 0.5 15

Option 2
utilization_scheduler 1 0.9 40

Maybe a brief description of my app would be helpful, too. I have an app that loads very quickly since it contains only code. The app lets a user read in a very large file and do some work on it, which can take 20 to 30 minutes.

So, ideally, I want each user who accesses the URL where the app lives to get their own R process and session, and in no way collide with any other user's work. We previously ran this app locally on each person's machine, so users never competed for resources.

I would like to come as close as possible to that type of usage, if that makes sense.

Thank you for your advice
Harold

Option 2 makes sense then. It won't aggressively start up processes pre-emptively, but any new user who arrives will get their own process on demand.