Capture list names as variable in df

cawthm · February 4, 2018, 9:16pm

I'm using the map family in purrr, and I'm having a hard time figuring out how to encode each list element's name as a variable (to use as a grouping var in ggplot) within a data frame.

In fact, I'm wondering if I'm even thinking about the data structures right: should I be relying on nested dfs instead of a list?

In any case, here's what I'm trying:

library(tidyverse)

# function that returns a df with vars x and y
toy_fn <- function(n) { tibble(x = seq(1,n), y = n*1.1) }

# some inputs for my toy function
my_n <- c(10,15,30,50)

x <- my_n %>% map(.f = toy_fn) %>% set_names(paste0("n_",my_n))
# x is a named list of four dfs that I'd like to plot with ggplot2 facets,
# and it's here where I'm stuck.
# I need to 1) collapse the list into a single df and 2) keep each list name as a variable to facet on ("df$name")


# I'd expect a final tibble df with vars x, y, and a created "name" var, which I'd pass to ggplot like so:

#ggplot(df) + geom_line(aes(x,y)) + facet_wrap(~name)

Thanks for any help with the mapping

markdly · February 5, 2018, 2:39am

How about using dplyr::bind_rows to collapse into a single df?

library(tidyverse)
x %>% bind_rows(.id = "name")

#> # A tibble: 105 x 3
#>     name     x     y
#>    <chr> <int> <dbl>
#>  1  n_10     1    11
#>  2  n_10     2    11
#>  3  n_10     3    11
#>  4  n_10     4    11
#>  5  n_10     5    11
#>  6  n_10     6    11
#>  7  n_10     7    11
#>  8  n_10     8    11
#>  9  n_10     9    11
#> 10  n_10    10    11
#> # ... with 95 more rows
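
For completeness, here is a minimal sketch that goes from the named list to the faceted plot in the original question. It assumes the same toy_fn and my_n as above; the object names df and p are just placeholders for illustration.

```r
library(tidyverse)

# Same toy function and inputs as in the question
toy_fn <- function(n) tibble(x = seq_len(n), y = n * 1.1)
my_n <- c(10, 15, 30, 50)

# Build the named list, then collapse it; .id stores each list name in "name"
df <- my_n %>%
  map(toy_fn) %>%
  set_names(paste0("n_", my_n)) %>%
  bind_rows(.id = "name")

# One panel per original list element
p <- ggplot(df, aes(x, y)) +
  geom_line() +
  facet_wrap(~name)
```

Printing p draws the plot. Since name is a character column, the panels should appear in alphabetical order unless you convert it to a factor first.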

cawthm · February 5, 2018, 3:52am

Thank you very much. I'd noticed that bind_rows nicely condensed a list of dfs, but I was unaware of the .id argument, which I see is likewise available in the map_dfr()-style functions.

jennybryan · February 5, 2018, 8:20am

Here's a slightly more compact version, taking advantage of map_dfr(), which calls bind_rows() for you.

library(tidyverse)

toy_fn <- function(n) tibble(x = seq_len(n), y = n * 1.1)
my_n <- c(2, 3, 4)
my_n <- set_names(my_n, paste0("n_", my_n))
map_dfr(my_n, toy_fn, .id = "hi")
#> # A tibble: 9 x 3
#>   hi        x     y
#>   <chr> <int> <dbl>
#> 1 n_2       1  2.20
#> 2 n_2       2  2.20
#> 3 n_3       1  3.30
#> 4 n_3       2  3.30
#> 5 n_3       3  3.30
#> 6 n_4       1  4.40
#> 7 n_4       2  4.40
#> 8 n_4       3  4.40
#> 9 n_4       4  4.40

Created on 2018-02-05 by the reprex package (v0.2.0).
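
To make the "calls bind_rows() for you" point concrete, here is a small sketch comparing the two routes on the same inputs. It assumes a purrr version that still provides map_dfr() and that dplyr is installed; the names via_map_dfr and via_bind_rows are only for illustration.

```r
library(tidyverse)

toy_fn <- function(n) tibble(x = seq_len(n), y = n * 1.1)
my_n <- c(2, 3, 4)
my_n <- set_names(my_n, paste0("n_", my_n))

# Route 1: map and row-bind in one step
via_map_dfr <- map_dfr(my_n, toy_fn, .id = "hi")

# Route 2: map first, then row-bind the resulting named list
via_bind_rows <- bind_rows(map(my_n, toy_fn), .id = "hi")

# Check that both routes give the same tibble
all.equal(via_map_dfr, via_bind_rows)
```

Note that map() carries the names of my_n through to its output, which is why naming the input vector is enough for .id to pick them up.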

markdly · February 5, 2018, 10:12pm

Thanks for this example! I for one have been using map_df in my code quite regularly instead of map_dfr:

library(purrr)
identical(map_df, map_dfr)
#> [1] TRUE

But I'll use map_dfr from now on, given that this now seems to be the preferred convention.
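
A note for anyone reading this later: as far as I know, purrr 1.0.0 marks both map_df() and map_dfr() as superseded in favour of map() followed by list_rbind(). A minimal sketch of the same result, assuming purrr >= 1.0.0 is installed (res is just an illustrative name):

```r
library(purrr)
library(tibble)

toy_fn <- function(n) tibble(x = seq_len(n), y = n * 1.1)
my_n <- c(2, 3, 4)
my_n <- set_names(my_n, paste0("n_", my_n))

# names_to plays the role that .id plays in map_dfr() and bind_rows()
res <- list_rbind(map(my_n, toy_fn), names_to = "name")
```

The older functions still work in those versions, so existing code using map_dfr() does not need to change right away.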