Switching plumber serialization type based on URL arguments

I have been trying to figure out how to switch the output format of a plumber API endpoint based on a user argument. Eventually I hit on the solution below: a low-level custom serializer that switches the serialization function and Content-Type header based on an object attribute. Since I didn't find any examples elsewhere, I'm posting it here in case it is useful or anyone has suggestions for improvements:

library(plumber)

#* @apiTitle Example Plumber API with Serializer to switch formats using object
#* attributes

serializer_switch <- function() {
  function(val, req, res, errorHandler) {
    tryCatch({
      format <- attr(val, "serialize_format")
      if (is.null(format) || format == "json") {
        type <- "application/json"
        sfn <- jsonlite::toJSON
      } else if (format == "csv") {
        type <- "text/csv; charset=UTF-8"
        sfn <- readr::format_csv
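      } else if (format == "rds") {
        type <- "application/rds"
        sfn <- function(x) base::serialize(x, NULL)
      } else {
        # Fail fast on unknown formats instead of hitting an undefined `sfn`
        stop("Unsupported serialize_format: ", format)
      }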
      val <- sfn(val)
      res$setHeader("Content-Type", type)
      res$body <- val
      res$toResponse()
    }, error = function(err) {
      errorHandler(req, res, err)
    })
  }
}

register_serializer("switch", serializer_switch)

#* Return a data frame of random values
#* @param n size of data frame
#* @param format one of "json", "csv", or "rds"
#* @serializer switch
#* @get /random_df
function(n = 10, format = "json") {
  out <- data.frame(value = rnorm(n))
  attr(out, "serialize_format") <- format
  out
}
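
For the "rds" branch, the response body is the raw vector produced by serialize(x, NULL), so a client needs unserialize() to recover the object. Here is a base-R sketch of that round trip with no HTTP involved; the variable names are only illustrative:

```r
# Server side: what the "rds" branch would put in the response body
df <- data.frame(value = c(0.1, 0.2, 0.3))
attr(df, "serialize_format") <- "rds"
body <- serialize(df, NULL)  # a raw vector

# Client side: turn the raw bytes back into an R object
restored <- unserialize(body)
```

Note that the serialize_format attribute travels with the object in this format, since serialize() keeps attributes.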

You could also use three routes:

library(plumber)

#* @apiTitle Example Plumber API with Serializer to switch formats using object
#* attributes

handler <- function(n = 10) {
  data.frame(value = rnorm(n))
}

#* Return a data frame of random values
#* @param n:int* size of data frame
#* @serializer json
#* @get /random_df/json
handler

#* Return a data frame of random values
#* @param n:int* size of data frame
#* @serializer rds
#* @get /random_df/rds
handler

#* Return a data frame of random values
#* @param n:int* size of data frame
#* @serializer csv
#* @get /random_df/csv
handler

Programmatic use:

library(plumber)
handler <- function(n = 10) {
  data.frame(value = rnorm(n))
}
pr() %>%
  pr_get("/random_df/csv", handler, serializer = serializer_csv()) %>%
  pr_get("/random_df/json", handler, serializer = serializer_json()) %>%
  pr_get("/random_df/rds", handler, serializer = serializer_rds()) %>%
  pr_run()

Yes, but that's a lot of code to repeat if I want many endpoints in my API to have the option of multiple formats. I'm new to the programmatic use approach, though, that's good to know!
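
One way to cut down the repetition in the programmatic style might be to register the per-format routes in a loop. A minimal sketch, assuming every endpoint should get the same set of formats; add_format_routes() is a hypothetical helper, not part of plumber:

```r
library(plumber)

handler <- function(n = 10) {
  data.frame(value = rnorm(as.integer(n)))
}

# Serializer name -> serializer; the name becomes the last path segment
formats <- list(
  json = serializer_json(),
  csv = serializer_csv(),
  rds = serializer_rds()
)

# Hypothetical helper: adds one GET route per format under `path`
add_format_routes <- function(pr, path, handler, formats) {
  for (fmt in names(formats)) {
    pr <- pr_get(pr, paste0(path, "/", fmt), handler, serializer = formats[[fmt]])
  }
  pr
}

api <- add_format_routes(pr(), "/random_df", handler, formats)
# pr_run(api)  # uncomment to start the server
```

Each additional endpoint then costs one add_format_routes() call rather than three pr_get() calls.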

How about this?

library(plumber)

#* @apiTitle Example Plumber API with Serializer to switch formats using object
#* attributes

serializers <- list(
  "json" = serializer_json(),
  "csv" = serializer_csv(),
  "rds" = serializer_rds()
)

#* Return a data frame of random values
#* @param n size of data frame
#* @param format one of "json", "csv", or "rds"
#* @get /random_df
function(n = 10, format = "json", res) {
  res$serializer <- serializers[[format]]
  data.frame(value = rnorm(n))
}
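
If an unknown format should fall back to a default rather than leave the lookup empty, the dictionary access could be wrapped in a small helper. This is only a sketch; pick_serializer() and its default argument are hypothetical names, not plumber functions:

```r
library(plumber)

serializers <- list(
  "json" = serializer_json(),
  "csv" = serializer_csv(),
  "rds" = serializer_rds()
)

# Hypothetical helper: return the serializer for `format`,
# or the `default` entry when `format` is not a known key
pick_serializer <- function(format, available = serializers, default = "json") {
  known <- is.character(format) && length(format) == 1 &&
    format %in% names(available)
  if (!known) {
    format <- default
  }
  available[[format]]
}
```

Inside the route, the assignment would then read res$serializer <- pick_serializer(format).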

That's an elegantly simple solution!


OK. There's more setup, but I think it has legs for adding to plumber.

This approach is modeled after as_attachment(), in that it only adjusts the return value and not the res object.

Additions:

  • Register a "dynamic" serializer that creates the serializer on the fly
    • The serializers are created on every request, which is slower than @meztez's solution above, but I don't think it's noticeable
    • We could add more checks, such as "type not allowed", by passing the allowed types via #' @serializer dynamic list(allowed = c("json", "csv", "rds")) to help prevent trying to serialize to a bad type.
  • Define a function (dynamic_ser()) that returns a value the "dynamic" serializer understands

library(plumber)

# Routes with `dynamic` serializer should return the result of this function
dynamic_ser <- function(value, type, ...) {
  structure(
    class = "serializer_dynamic_payload",
    list(
      value = value,
      type = type,
      args = list(...)
    )
  )
}
register_serializer(
  "dynamic",
  function(...) {
    ellipsis::check_dots_empty()

    function(val, req, res, errorHandler) {
      if (!inherits(val, "serializer_dynamic_payload")) {
        stop("Route did not return the result of `dynamic_ser(value, type)`")
      }

      # extract info
      value <- val$value
      type <- val$type
      args <- val$args

      # If this was submitted as a PR, this value is available within the plumber package
      ser_factory <- plumber:::.globals$serializers[[type]]
      if (!is.function(ser_factory)) {
        stop("Dynamic serializer of type ", type, " not found. See `registered_serializers()` for known types.")
      }

      # generate serializer
      ser <- do.call(ser_factory, args)

      # serialize
      ser(value, req, res, errorHandler)
    }
  }
)
#./plumber.R

#* @apiTitle Example Plumber API with dynamic serializer

#* Return a data frame of random values
#* @param n size of data frame
#* @param format one of "json", "csv", or "rds"
#* @get /random_df
#* @serializer dynamic
function(n = 10, format = c("json", "csv", "rds"), res) {

  format <- match.arg(format)
  value <- data.frame(value = rnorm(n))

  switch(format,
    "json" = {
      # able to pass through arguments
      dynamic_ser(value, "json", auto_unbox = TRUE)
    },
    # default case
    {
      dynamic_ser(
        # also download as a file!
        as_attachment(value, filename = paste0("random_data.", format)),
        format
      )
    }
  )
}

Thoughts on this approach? Or should we promote using the serializer dictionary?

The serializer dictionary approach could also have extra entries in the list() to support serializers with different arguments, e.g. json1 with auto_unbox = TRUE and json2 with auto_unbox = FALSE. So neither approach has a functional advantage over the other. The serializer dictionary approach also works with returning as_attachment(value) objects. You, as the author, would have to know which ones to prep.
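
As a rough sketch of that idea, the dictionary could hold two JSON entries that differ only in their arguments. The keys json1 and json2 are just the example names from above:

```r
library(plumber)

# Same serializer factory, different arguments, distinct keys
serializers <- list(
  "json1" = serializer_json(auto_unbox = TRUE),
  "json2" = serializer_json(auto_unbox = FALSE),
  "csv" = serializer_csv(),
  "rds" = serializer_rds()
)
```

A route would select one with res$serializer <- serializers[[format]], exactly as in the dictionary example earlier in the thread.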

What about allowing multiple serializers to be registered, like parsers?

#* Return a data frame of random values
#* @param n size of data frame
#* @get /random_df
#* @serializer csv
#* @serializer json
#* @serializer rds
handler

The default behaviour would be to use the first one in the dict, but we could use the Accept header, like GitHub does, to select a different serializer. Plus, I believe it would not require a lot of code modification. We would probably have to fiddle a bit with the OpenAPI spec to facilitate header handling.
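
To make the selection step concrete, here is a base-R sketch of choosing a serializer name from an Accept header value. It is not how plumber works today: pick_by_accept() and the mime_types map are hypothetical, and quality values (q=) are ignored for simplicity.

```r
# Serializer name -> MIME type; the first entry is the default
mime_types <- c(
  csv = "text/csv",
  json = "application/json",
  rds = "application/rds"
)

# Hypothetical helper: return the name of the first registered serializer
# whose MIME type appears in the Accept header, else the default
pick_by_accept <- function(accept, types = mime_types) {
  default <- names(types)[1]
  if (is.null(accept) || !nzchar(accept)) {
    return(default)
  }
  # Split on commas, drop parameters such as ";q=0.8", trim whitespace
  requested <- strsplit(accept, ",", fixed = TRUE)[[1]]
  requested <- trimws(sub(";.*$", "", requested))
  for (mime in requested) {
    if (mime == "*/*") {
      return(default)
    }
    hit <- names(types)[types == mime]
    if (length(hit) > 0) {
      return(hit[1])
    }
  }
  default
}
```

In a handler, the header value would typically be read from req$HTTP_ACCEPT and the returned name used to look up the serializer.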

I like the header idea. Less work by the user. Everything stays very familiar. With the serializers defined at the top, they could be pre-processed too.

This seems like a better final solution. Nice!
