Looping over lists with multiple elements - best approach?

I am looking to run a set of models and save the results for later processing. purrr's walk is perfect for this, but I am trying to figure out which of the following two approaches is "best practice" and whether there is an even better one. Both are shown below (warning: running this code will write files to your working directory).

Approach one follows the manual for pwalk (I use pwalk because more lists will be added later) and creates two different lists with the same number of elements. The beauty of this is that the code itself is easy to read and flexible with the named arguments. The problem is that once the number of elements gets large it is harder to ensure that the different parts match across the m and n lists.

Approach two creates one list where all the elements that belong together are combined with c() (because a formula is not an atomic value, c() returns a small list here rather than a plain vector). The upside is that it is much easier to visually check that "like goes with like" in the list. The downside is that the code is ugly as hell and relies on the position of each part, which seems more error-prone. If the function code is long, I could name the parts first and then refer to the names, but you still have to use the numbers at one point.

I lean towards Approach two because it is shorter and keeps the parts together, but is there a way to avoid using the numbering to refer to elements of the list?

suppressMessages(library(tidyverse))

# Approach one

m <- list(
  as.formula(mpg ~ factor(am)),
  as.formula(mpg ~ factor(gear))
)

n <- list(
  "name_1.rds",
  "name_2.rds"
)

list_1 <- list(model = m, name = n)
pwalk(list_1, function(model, name) {
  .m_result <- lm(model, data = mtcars)
  write_rds(.m_result, name)
})

# Approach two

list_2 <- list(
  c(as.formula(mpg ~ factor(am)),   "name_3.rds"),
  c(as.formula(mpg ~ factor(gear)), "name_4.rds")
)

walk(list_2, ~{
  .m_result <- lm(.x[[1]], data = mtcars)
  write_rds(.m_result, .x[[2]])
})

walk(list_2, ~{
  model <- .x[[1]]
  name <- .x[[2]]
  .m_result <- lm(model, data = mtcars)
  write_rds(.m_result, name)
})

Created on 2021-09-02 by the reprex package (v2.0.0)
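
One way to drop the numeric indices in Approach two, sketched here under a few assumptions: each element becomes a named list instead of a c() call, so the parts are looked up by name. The names `spec`, `fit` and `out_dir` are illustrative and not part of the original code, and the files go to a temporary directory instead of the working directory.

```r
library(purrr)

# Each element is a named list, so "like goes with like" stays visible
# and the parts can be referred to by name instead of by position.
list_2 <- list(
  list(model = mpg ~ factor(am),   name = "name_3.rds"),
  list(model = mpg ~ factor(gear), name = "name_4.rds")
)

# Assumption: write to a temporary directory rather than the working directory.
out_dir <- tempdir()

walk(list_2, function(spec) {
  fit <- lm(spec$model, data = mtcars)
  saveRDS(fit, file.path(out_dir, spec$name))
})
```

The trade-off is a little more typing per element (the names are repeated on every row), in exchange for not depending on element order.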

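I would opt for a combination of approaches one and two.
By using tribble instead of separate lists, each model sits on the same row as its name, so the correspondence between model and name is guaranteed.
No list manipulation is needed because the tibble IS a list (of columns), and pwalk matches those columns to the function's arguments by name.
And if you add more lists, they will simply be more columns in the tibble.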

model_data <- tibble::tribble(
  ~model,               ~name,
  "mpg ~ factor(am)",   "name_1.rds",
  "mpg ~ factor(gear)", "name_2.rds"
)

purrr::pwalk(model_data, function(model, name) {
  .m_result <- lm(as.formula(model), data = mtcars)
  saveRDS(.m_result, name)
})
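
To make the "tibble IS a list" point concrete, here is a small sketch that writes no files. It uses pmap_chr, which iterates over rows the same way pwalk does but returns a value, so the name-based matching can be inspected. The variables `labels` and `swapped` are only for illustration.

```r
library(purrr)
library(tibble)

model_data <- tribble(
  ~model,               ~name,
  "mpg ~ factor(am)",   "name_1.rds",
  "mpg ~ factor(gear)", "name_2.rds"
)

is.list(model_data)  # TRUE: a data frame is a list of columns
names(model_data)    # "model" "name"

# Columns are matched to function arguments by name, one row at a time.
labels <- pmap_chr(model_data, function(model, name) {
  paste(name, "<-", model)
})

# Because matching is by name, the argument order in the function is irrelevant.
swapped <- pmap_chr(model_data, function(name, model) {
  paste(name, "<-", model)
})

identical(labels, swapped)  # TRUE
```

This is also why the function's argument names have to agree with the column names: a column without a matching argument (or a `...` to absorb it) would cause an error.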

@HanOostdijk Thank you so much. That works perfectly; it even allows for lists and expr as input.
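
The exact code used for the list and expr inputs is not shown in the thread. As a rough sketch of the list case, assuming tibble::tibble() with a list-column (rather than tribble) and a temporary output directory, formula objects can be stored directly so that as.formula is no longer needed. The names `fit` and `out_dir` are illustrative.

```r
library(purrr)
library(tibble)

# A list-column holds the formula objects themselves; the name column
# stays an ordinary character vector.
model_data <- tibble(
  model = list(mpg ~ factor(am), mpg ~ factor(gear)),
  name  = c("name_1.rds", "name_2.rds")
)

# Assumption: write to a temporary directory rather than the working directory.
out_dir <- tempdir()

pwalk(model_data, function(model, name) {
  # `model` arrives as a formula, so it can go straight into lm()
  fit <- lm(model, data = mtcars)
  saveRDS(fit, file.path(out_dir, name))
})
```

Reading a result back with readRDS(file.path(out_dir, "name_1.rds")) should return the fitted lm object.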
