Loop with map_df in batches/chunks of 2 vectors (not all at once)?

purrr

#1

I’m trying to map a function over a data frame of album titles and their respective artists to obtain album data from Spotify’s API, but my loop seems to work more reliably when I limit the number of albums the function searches for to 2 per call.

Is there a way, perhaps something in purrr, to automate my loop so that it only searches for album data in batches/chunks of 2 (or any number)?

Here’s an example:

library(dplyr)
library(spotifyr)
library(purrr)

# from Spotify's developer page
Sys.setenv(SPOTIFY_CLIENT_ID = "xxx")
Sys.setenv(SPOTIFY_CLIENT_SECRET = "xxx")
access_token <- get_spotify_access_token()

Artist <- c("Spiritualized", "Fleet Foxes", "The Avalanches", "Ween")
Album <- c("Sweet Heart, Sweet Light", "Helplessness Blues", "Wildflower", "The Mollusk")

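# data_frame() is deprecated; tibble() is re-exported by dplyr
mydata <- tibble(Artist, Album)

# Fetch audio features for the artist in row x and keep only the matching album
get_album_data <- function(x) {
  get_artist_audio_features(mydata[["Artist"]][x], return_closest_artist = TRUE) %>% 
    filter(agrepl(mydata[["Album"]][x], album_name, ignore.case = TRUE)) %>%
    mutate(Artist = mydata[["Artist"]][x])  # name the new column explicitly
}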

# will work here but is inconsistent
album_data <- map_df(seq(1, 4), get_album_data)

# what I want to be done automatically
album_data_1 <- map_df(seq(1, 2), get_album_data)
album_data_2 <- map_df(seq(3, 4), get_album_data)

Created on 2018-02-21 by the reprex package (v0.2.0).


#2

Are you sure that the inconsistency doesn’t come from the way Spotify rate-limits requests? What happens if you add a pause to your get_album_data (e.g., Sys.sleep(10) for 10 seconds)?
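
If the pause helps, one way to add it without editing the function body might be purrr's slowly() adverb, which wraps a function so that successive calls are spaced out. This is only a sketch: fetch_one is a hypothetical stand-in for get_album_data so that it runs without Spotify credentials, and the 0.2-second delay is an arbitrary choice.

```r
library(purrr)

# Hypothetical stand-in for the real API call (not part of spotifyr)
fetch_one <- function(i) data.frame(index = i)

# Wrap the function so that successive calls are separated by a delay
fetch_slowly <- slowly(fetch_one, rate = rate_delay(pause = 0.2))

# Same map_df call as before, now rate-limited
result <- map_df(seq(1, 3), fetch_slowly)
```
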

To do what you are asking, you can add an additional level of mapping that generates the pairs of indices for each batch, but I’m not sure it’ll help, to be honest:

> purrr::map(c(1, 2, 3, 4, 5), function(x) {c(2*x -1, 2*x)})
[[1]]
[1] 1 2

[[2]]
[1] 3 4

[[3]]
[1] 5 6

[[4]]
[1] 7 8

[[5]]
[1]  9 10
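
Building on the index pairs above, here is a rough sketch of how the batching itself could be automated. It assumes nothing about spotifyr: fetch_one is a hypothetical stand-in for get_album_data, and map_df_chunked, size and pause are names made up for this example. split() cuts the indices into chunks of the requested size (the last chunk may be shorter), and the outer map_df row-binds the per-chunk results.

```r
library(purrr)

# Hypothetical stand-in for get_album_data(); returns one row per index
fetch_one <- function(i) {
  data.frame(index = i, value = i^2)
}

# Apply f to idx in chunks of `size`, pausing `pause` seconds after each chunk
map_df_chunked <- function(idx, f, size = 2, pause = 0) {
  # Group positions 1..n into consecutive chunks: 1 1 2 2 3 ...
  chunks <- split(idx, ceiling(seq_along(idx) / size))
  map_df(chunks, function(chunk) {
    out <- map_df(chunk, f)
    Sys.sleep(pause)  # give the API a rest before the next chunk
    out               # return the chunk's rows, not Sys.sleep()'s NULL
  })
}

result <- map_df_chunked(seq(1, 5), fetch_one, size = 2, pause = 0)
```

With the real function this would be called as something like map_df_chunked(seq_len(nrow(mydata)), get_album_data, size = 2, pause = 10), where the pause length is a guess to be tuned.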

#3

Great idea, Misha. I’ll try that. Is that something I can just add to the end of my function?

ex:

get_album_data <- function(x) {
  get_artist_audio_features(fantano$Artist[x], return_closest_artist = TRUE) %>% 
    filter(agrepl(fantano$Album[x], album_name, ignore.case = TRUE)) %>%
    mutate(fantano$Artist[x])

    Sys.sleep(10)
}

I really do appreciate you answering all of my questions, by the way. Super helpful for me


#4

Actually, it seems like it only works when I place it at the beginning of my function


#5

Yes, this is what I meant. You can try with lower sleep times if that works.
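
For what it's worth, the position matters because an R function returns the value of its last expression, and Sys.sleep() returns NULL invisibly. A small sketch with made-up function names (sleep_last, sleep_first) and a short delay shows the difference:

```r
# Sleeping last: the function's value is whatever Sys.sleep() returns (NULL)
sleep_last <- function(x) {
  result <- x * 2
  Sys.sleep(0.1)
}

# Sleeping first: the last expression is still the computed result
sleep_first <- function(x) {
  Sys.sleep(0.1)
  x * 2
}

lost <- sleep_last(1)   # NULL, so map_df would have nothing to bind
kept <- sleep_first(1)  # 2
```

If you would rather sleep after the request, storing the result in a variable and returning that variable as the final line should also work.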