Split columns or vector to pass to Purrr Function

I have a web scrape function that gets data from an API. I pass a df column to one of its arguments. The issue I'm having is that the URL accepts at most 500 numbers in one of its parameters, and my df has 2000 rows.

How would I split the rows by 500 in order to pass the values into the function?

I've created a very basic reprex that shows the workflow I'm after. I want to pass the split df column to the parse function. I'm guessing I would need to wrap the JSON parse with map_dfr.

library(tidyverse)

sample_df <- tibble(id = 1:20,
                    col_2 = rnorm(20))

# parse function
parse_people <- function(ids = c("1", "10"), argument_2 = NULL){
  # Fake Base Url
  base_url <- "https://www.thisisafakeurl.com/api/people?Ids="

  # fix query parameters to collapse Ids to pass to URL
  ids<- stringr::str_c(ids, collapse = ",")

  url <- glue::glue("{base_url}{ids}")

  # Get URL
  resp <- httr::GET(url)

  # Save Response in JSON Format
  out <- httr::content(resp, as = "text", encoding = "UTF-8")

  # Read into JSON format.
  jsonlite::fromJSON(out, simplifyDataFrame = TRUE, flatten = TRUE)

}


sample_parse <- parse_people(sample_df$id)

I'm assuming argument_2 and col_2 are red herrings, so I didn't do anything with them.
I also removed the JSON part because it errored (the URL is fake), so I just return a list of the character outputs, one per group.

This code groups the ids into 5s; you can alter the hardcoding or parameterise it. I tested it on ids 1:21 to prove that it handles a partial group (i.e. a final group with fewer than 5 cases in it).

library(tidyverse)

sample_df <- tibble(id = 1:20,
                    col_2 = rnorm(20))

parse_group <- function(ids,base_url){
  # fix query parameters to collapse Ids to pass to URL
  ids<- stringr::str_c(ids, collapse = ",")
  
  url <- glue::glue("{base_url}{ids}")
  
  # Get URL
  resp <- httr::GET(url)
  
  # Save Response in JSON Format
  out <- httr::content(resp, as = "text", encoding = "UTF-8")
  
  # Read into JSON format.
  # I removed this so as not to worry about the error it was throwing
  # jsonlite::fromJSON(out, simplifyDataFrame = TRUE, flatten = TRUE)

  # return the raw response text explicitly
  out
}
# parse function
parse_people <- function(ids = c("1", "10"), argument_2 = NULL){
  # Fake Base Url
  base_url <- "https://www.thisisafakeurl.com/api/people?Ids="
  
  #ids into groups of 5
  id_df <- enframe(ids,
                   value="id",name=NULL) 
  nrow_id_df <- nrow(id_df)
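  # integer arithmetic avoids floating point error in the group sizes
  full_groups_of_5 <- nrow_id_df %/% 5L
  # size of the final partial group (0 when the ids divide evenly)
  partial_group_size <- nrow_id_df %% 5L
  # seq_len() and rep() also cope with fewer than 5 ids (no full groups)
  full_group_df <- data.frame(group=seq_len(full_groups_of_5),
                              count=rep(5L, full_groups_of_5))
  partial_group_df <- data.frame(group=full_groups_of_5+1L,
                              count=partial_group_size)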
  expanded_group_df <- uncount(data = bind_rows(full_group_df,partial_group_df),
                               weights = count)
  
  id_df_with_groups <- bind_cols(id_df,expanded_group_df)
  id_df_with_groups <- group_by(id_df_with_groups,group)
  dplyr::group_map(.tbl = id_df_with_groups,
                   .f = ~parse_group(.$id,base_url))
  
}

sample_parse <- parse_people(sample_df$id)
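
For what it's worth, the grouping step can probably be done more compactly with base R's split(). The sketch below is only an illustration: chunk_ids() and build_url() are hypothetical helpers that are not part of the code above, and build_url() only builds the URL string instead of calling httr::GET(), so it runs offline.

```r
library(purrr)

# Hypothetical helper: split a vector into consecutive chunks of at most `size`.
chunk_ids <- function(ids, size = 500L) {
  split(ids, ceiling(seq_along(ids) / size))
}

# Stand-in for the real request (assumption): only builds the URL
# that would be fetched for one chunk of ids.
build_url <- function(ids,
                      base_url = "https://www.thisisafakeurl.com/api/people?Ids=") {
  paste0(base_url, paste(ids, collapse = ","))
}

# 21 ids in groups of 5 gives four full groups and one partial group
chunks <- chunk_ids(1:21, size = 5L)

# one URL per chunk
urls <- map_chr(chunks, build_url)
```

With the real function you would swap build_url for the function that does the request, and use map() (or map_dfr(), if each call returns a data frame) instead of map_chr(). With size = 500L, 2000 ids give four chunks of 500.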
