Plumber, RStudio Connect, and Dependent Script Files

Hi everyone. I hope you can help me with a question about plumber, RStudio Connect and dependent script files. I am currently developing a solution locally with plumber and, thinking ahead to production options, I am looking at publishing the API to RStudio Connect. When run locally, I have multiple .R scripts (table_specs, load_functions, table_item_info, etc.) that are run first; they define functions and generate variables that are stored in the global environment. These scripts only need to be run/loaded once and shouldn’t be reloaded every time the API endpoint is hit.

My question is, how is something like this handled by RStudio Connect? Can functions and variables be published and stored in Connect and then made available to the API script? Should everything be put in a package which then gets bundled when published? How can the scripts that define the variables be run once when published but not whenever the API is accessed? Sorry if this is confusing. Thanks in advance.

This is a great question. From the Plumber docs:

"When you plumb() a file, Plumber calls source() on that file which will evaluate any top-level code that you have defined."

This means that any globally defined functions / objects are made available to subsequently defined Plumber endpoints. If you have these object definitions in a separate R script, you can source that script at the beginning of the Plumber file in order to make the objects available to endpoints.

# utils.R
add <- function(x, y) x + y

# plumber.R
library(plumber)

source("utils.R")

#* Add two values
#* @param x First value
#* @param y Second value
#* @get /sum
function(x, y) add(as.numeric(x), as.numeric(y))

In this example, the definition for the add() function is only run once, when the file is plumbed. Calls to the /sum endpoint use this function, but don't require it to be re-defined each time.
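To see the "run once" behaviour without starting an API, here is a minimal sketch in plain R. It assumes a stand-in utils.R written to a temporary file, a hypothetical loaded_at timestamp, and a hypothetical sum_handler() that plays the role of the /sum endpoint; none of these names come from Plumber itself.

```r
# Hypothetical stand-in for utils.R, written to a temp file so the sketch is self-contained.
utils_path <- tempfile(fileext = ".R")
writeLines(c(
  "message('utils.R evaluated')",
  "loaded_at <- Sys.time()",          # records when the script was sourced
  "add <- function(x, y) x + y"
), utils_path)

# Top-level code: evaluated once, as it would be when the file is plumbed.
source(utils_path)
first_stamp <- loaded_at

# Stand-in for the /sum handler; each call plays the role of one request.
sum_handler <- function(x, y) add(as.numeric(x), as.numeric(y))

results <- vapply(1:3, function(i) sum_handler("1", as.character(i)), numeric(1))

# The timestamp is unchanged, so utils.R was not evaluated again by the "requests".
stopifnot(identical(first_stamp, loaded_at))
```

The message from utils.R appears a single time, however many times sum_handler() is called.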

RStudio Connect will try to identify any external dependencies your API may have. For example, consider a directory containing the plumber.R and utils.R files defined above. When publishing to RStudio Connect, you are given the option of including utils.R along with plumber.R.

Once the API is published to RStudio Connect, the code in utils.R runs once for each R process associated with the API, not once per request. The Runtime settings of the API can be adjusted in RStudio Connect to control how many R processes are dedicated to the API.
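One practical consequence is that each R process holds its own copy of the global objects, so anything modified in one process is not visible to the others. A rough way to observe this is an endpoint that reports which process answered the request. The sketch below is an assumption-laden illustration: the /process-info route, the process_info() helper and the loaded_at variable are all hypothetical names.

```r
# plumber.R (sketch)

# Evaluated once per R process, when the file is plumbed.
loaded_at <- Sys.time()

# Hypothetical helper: describes the process that is serving the request.
process_info <- function() {
  list(
    pid = Sys.getpid(),              # differs between R processes
    loaded_at = format(loaded_at)    # when this process evaluated its top-level code
  )
}

#* Report which R process handled the request
#* @get /process-info
function() process_info()
```

If the API is served by several processes, repeated requests may return different pid and loaded_at values, one pair per process.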

More detail about these settings can be found in the RStudio Connect Admin Guide.


Thanks, Blair, for the detailed answer; that helps a lot. It got me thinking about a follow-up question on the Connect environment. Would a file published with one API be available to a different hosted API? For example, if a second API was created and published that also required utils.R to be sourced, would it be able to access the file because it was published with the first API, or would utils.R need to be selected and published again with the second API? Thanks again.

This support article provides some guidance; the section on Absolute References is the relevant part.

The main idea is that if you want a single file (in this example utils.R) to be accessed by multiple APIs, you'll need to place this file on the RStudio Connect server in a location that each API can properly access. This complicates the deployment process a little bit (utils.R won't be published with the API, and must be maintained separately) but it's certainly doable. Using the config package as suggested in the support article can help manage the difference between development and deployment environments.
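As a rough sketch of the config approach, the example below keeps the location of utils.R in a config.yml and looks it up with config::get(). The utils_path key and the /opt/shared/r/utils.R location are hypothetical, and the file is written to a temporary directory only to keep the example self-contained. It also assumes the config and yaml packages are installed, and that Connect sets R_CONFIG_ACTIVE to "rsconnect" for deployed content, which is worth confirming for your server.

```r
# Write a hypothetical config.yml to a temp directory so the sketch is self-contained.
config_file <- file.path(tempdir(), "config.yml")
writeLines(c(
  "default:",
  "  utils_path: \"utils.R\"",                 # local development: file next to plumber.R
  "rsconnect:",
  "  utils_path: \"/opt/shared/r/utils.R\""    # hypothetical shared location on the server
), config_file)

# Look up the value for each configuration explicitly, to show the difference.
local_path  <- config::get("utils_path", config = "default",   file = config_file)
server_path <- config::get("utils_path", config = "rsconnect", file = config_file)

# In plumber.R the configuration would normally be chosen by the R_CONFIG_ACTIVE
# environment variable rather than named explicitly, for example:
# source(config::get("utils_path"))
```

With this layout the same plumber.R sources the local copy during development and the shared copy once deployed, provided the shared file is readable by the user that runs the API on the server.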

