How to make thousands of API calls responsibly?

I'm calling an API several thousand times and want to (1) do it responsibly; and (2) log success/failures due to rate limiting, bad requests, etc. I've tinkered with some combination of try-catch, for loops + Sys.sleep, purrr's insistently, slowly, possibly adverbs, but I'm not sure how to make the calls, prevent rate-limiting, and account for errors (either by trying again via insistently or skipping).

Here are skeletons of my two initial setups:

Skeleton 1

# calls is an object to make the calls
responses <- vector("list", length = length(calls))
for (i in seq_along(calls)) {
  message("...")
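  r <- try({
    ### API call here
  })
  # ifelse() cannot return NULL, so only store the result on success;
  # failed calls leave the pre-allocated slot as NULL
  if (!inherits(r, "try-error")) responses[i] <- list(r)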
  Sys.sleep(8)
}

Skeleton 2

library(purrr)
rate <- rate_backoff(pause_base = 5)
get_responses_insistently <- insistently(API_FUNC_CALL, rate = rate)
get_responses_insistently_but_possibly <- possibly(get_responses_insistently, otherwise = NULL)
responses <- map(calls, get_responses_insistently_but_possibly)

Is there a better approach?

I'd go with the first one: if it breaks, you can easily pick up where you left off using i, and it's easier not to cram complex functions into apply statements (plus, I really don't like purrr).
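To make the "start where you left off" idea concrete, here is a minimal sketch of a resumable loop that also logs each outcome. fake_api_call, status and start_at are hypothetical names used only for illustration; swap in the real API call.

```r
# Hypothetical stand-in for the real API call; it fails on one input
# to show that a single error does not stop the loop.
fake_api_call <- function(x) {
  if (x == 3) stop("simulated failure")
  x * 10
}

calls <- 1:5
responses <- vector("list", length(calls))
status <- rep(NA_character_, length(calls))  # "ok" or the error message

start_at <- 1  # after an interruption, set this to the last value of i
for (i in seq_along(calls)) {
  if (i < start_at) next  # skip calls that already completed
  r <- tryCatch(fake_api_call(calls[[i]]), error = function(e) e)
  if (inherits(r, "error")) {
    status[i] <- conditionMessage(r)  # log the failure and move on
  } else {
    responses[i] <- list(r)
    status[i] <- "ok"
  }
}
```

Because responses and status live outside the loop, whatever was collected before a crash or interrupt is still there, and status tells you which calls to retry later.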

For rate limiting, I just use Sys.sleep: consult the API docs for how many calls per minute/hour/day you're allowed and base your pause on that.
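As a rough sketch of turning a documented limit into a pause, something like the following could work. pause_for is a hypothetical helper, and the limit of 120 calls per minute is an assumed number for illustration, not a figure from any particular API.

```r
# Hypothetical helper: seconds to pause between calls so that at most
# `limit` calls are made per window of `window_secs` seconds.
pause_for <- function(limit, window_secs = 60) {
  window_secs / limit
}

# Assumed limit of 120 calls per minute; take the real number from the docs.
delay <- pause_for(limit = 120)

for (i in 1:3) {
  # ... API call goes here ...
  Sys.sleep(delay)
}
```

This ignores the time the call itself takes, so it errs on the slow side, which is usually what you want when the goal is to stay under the limit.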

Then, based on the status of the reply, you can add an extra pause. Here's an outline of what I've done for Azure translation:

library(roperators) # for paste via %+%
library(magrittr)   # for %<>% and %>%
library(jsonlite)   # for fromJSON()
library(curl)


df1$Comm1 %<>%
    tm::stripWhitespace() %>% 
    tm::removePunctuation()

res <- list()
for (i in seq_len(nrow(df1))) {
  try({
    this_text <- df1$Comm1[i]
    out <- system('curl -X POST \"https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&to=en\" -H \"Ocp-Apim-Subscription-Key: xxxxxxxxxx\" -H \"Content-Type: application/json\" -d \"[{\'Text\':\'' %+%
                    this_text %+% '\'}]\"',
                  intern = TRUE)
    out <- fromJSON(out)
    out$respid <- df1$respid[i]
    res[[i]] <- out
  })
}

saveRDS(res, file = 'res1.RDS')

You can add any handlers you like based on what the API you're calling returns. Also, I ran this on OS X, where a system call to curl works; on Windows you may need to use the curl package (or call R from the Windows Subsystem for Linux).
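One possible shape for such a handler, assuming the API reports problems through standard HTTP status codes (429 for rate limiting, 4xx for bad requests, 5xx for server errors). next_action and extra_wait are hypothetical helpers, and the base and cap values are arbitrary.

```r
# Hypothetical helper: decide what to do from an HTTP status code.
next_action <- function(status_code) {
  if (status_code >= 200 && status_code < 300) {
    "ok"
  } else if (status_code == 429 || status_code >= 500) {
    "retry"  # rate limited or server error: wait longer, then try again
  } else {
    "skip"   # e.g. 400 bad request: retrying will not help
  }
}

# Hypothetical helper: extra wait in seconds that doubles with each
# retry attempt, capped at max_wait.
extra_wait <- function(attempt, base = 5, max_wait = 60) {
  min(base * 2^(attempt - 1), max_wait)
}
```

Inside the loop you would call next_action() on the reply's status, then Sys.sleep(extra_wait(attempt)) before retrying, or log the failure and move on when the action is "skip". If you use the curl package instead of a system call, curl::curl_fetch_memory() returns a list whose status_code element can be passed straight in.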
