Keep common variables in multiple lists and bind rows

Hello,

Thanks everyone in advance for the help even if this seems very basic!

I have a function that uploads a series of surveys from the survey software Qualtrics. The package qualtRics uses an API to get survey data directly to R.

I created a function that uploads multiple surveys using the map function, so each survey is stored as its own element of a list. What I want to do is bind the rows of all the surveys, since they have some common variables, but I do not know how to manipulate each list element to keep only the common variable names. I am assuming it is another map function using pluck, but I can't figure out exactly what to do.

Thank you for the help!

dplyr::bind_rows is smart when it comes to varying variables: it matches columns by name and fills any column that is missing from a data frame with NA.
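
To make that concrete, here is a minimal sketch. The two tables and their column names (survey_a, survey_b, id, q1, q2) are made up for illustration and are not the actual Qualtrics data from the question.

```r
library(dplyr)

# Two hypothetical survey tables that share only the `id` column
survey_a <- tibble(id = 1:2, q1 = c("yes", "no"))
survey_b <- tibble(id = 3:4, q2 = c(5, 3))

# Columns are matched by name; the result has the union of all columns
combined <- bind_rows(survey_a, survey_b)

names(combined)          # id, q1, q2
sum(is.na(combined$q1))  # q1 is NA for the two rows coming from survey_b
```

So nothing is dropped here: columns that exist in only one table are kept and padded with NA.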

It is difficult to really help you without a small example of what you want to do, what you tried, and what you want to get as a result.
Can you help us help you by providing a reprex?

Here is one dummy example based on what I understood from your question.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
dummylist <- list(
  tab1 = tibble(col1 = 1:10,
                col2 = sample(LETTERS, 10)),
  tab2 = tibble(col1 = 1:10,
              col2 = sample(LETTERS, 10))
)


dummylist %>%
  bind_rows(.id = "tab_num")
#> # A tibble: 20 x 3
#>    tab_num  col1 col2 
#>    <chr>   <int> <chr>
#>  1 tab1        1 J    
#>  2 tab1        2 P    
#>  3 tab1        3 D    
#>  4 tab1        4 I    
#>  5 tab1        5 B    
#>  6 tab1        6 X    
#>  7 tab1        7 G    
#>  8 tab1        8 Y    
#>  9 tab1        9 Z    
#> 10 tab1       10 L    
#> 11 tab2        1 A    
#> 12 tab2        2 H    
#> 13 tab2        3 S    
#> 14 tab2        4 D    
#> 15 tab2        5 U    
#> 16 tab2        6 O    
#> 17 tab2        7 K    
#> 18 tab2        8 R    
#> 19 tab2        9 X    
#> 20 tab2       10 F

You could also use purrr::map_df to get a tibble output directly.

# dummy example with identity function
dummylist %>%
  purrr::map_df(identity, .id = "tab_num")
#> # A tibble: 20 x 3
#>    tab_num  col1 col2 
#>    <chr>   <int> <chr>
#>  1 tab1        1 J    
#>  2 tab1        2 P    
#>  3 tab1        3 D    
#>  4 tab1        4 I    
#>  5 tab1        5 B    
#>  6 tab1        6 X    
#>  7 tab1        7 G    
#>  8 tab1        8 Y    
#>  9 tab1        9 Z    
#> 10 tab1       10 L    
#> 11 tab2        1 A    
#> 12 tab2        2 H    
#> 13 tab2        3 S    
#> 14 tab2        4 D    
#> 15 tab2        5 U    
#> 16 tab2        6 O    
#> 17 tab2        7 K    
#> 18 tab2        8 R    
#> 19 tab2        9 X    
#> 20 tab2       10 F

Created on 2018-02-02 by the reprex package (v0.1.1.9000).

Thank you, @alistaire. I realized that dplyr::bind_rows only keeps similar variables. So I just used this with purrr::reduce. Very simple!

Thanks @cderv for the response! I tried purrr::map_df, but I don't think it binds rows in a "smart" way like dplyr::bind_rows does. When I used purrr::reduce with dplyr::bind_rows as the function, this seemed to do what I wanted.

If you look at the source code of map_df, you will see that it uses bind_rows, so it is just as smart :)

However, in your case, it seems you need a reducing step on your list before binding, something that map_df does not provide.
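
One possible way to do such a reducing step, as a rough sketch: find the column names shared by every table, keep only those, then bind. The list `surveys` and its columns are invented for the example, and `all_of()` assumes a reasonably recent dplyr (1.0.0 or later).

```r
library(dplyr)
library(purrr)

# Hypothetical list of survey tables; only `id` and `age` appear in both
surveys <- list(
  s1 = tibble(id = 1:2, age = c(30, 41), q1 = c("a", "b")),
  s2 = tibble(id = 3:4, age = c(25, 52), q2 = c("x", "y"))
)

# Reducing step: intersect the column names across all tables
common_cols <- surveys %>%
  map(names) %>%
  reduce(intersect)

# Keep only the shared columns in each table, then bind the rows
common_only <- surveys %>%
  map(~ select(.x, all_of(common_cols))) %>%
  bind_rows(.id = "survey")

names(common_only)  # survey, id, age
```

If the goal is instead to keep every column and accept NA where a survey lacks one, the select step can simply be skipped.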

Thanks for pointing that out. So I am thinking that map_dfr and purrr::reduce with bind_rows as the function should give exactly the same result, then?
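
A quick way to check this on toy data is sketched below. The list `tabs` is made up for the comparison, and this only shows that the two approaches agree on this particular input, not that they are equivalent in every case.

```r
library(dplyr)
library(purrr)

# Hypothetical tables with partly overlapping columns
tabs <- list(
  tibble(id = 1:2, q1 = c("a", "b")),
  tibble(id = 3:4, q2 = c(1.5, 2.5)),
  tibble(id = 5L, q1 = "c")
)

# Pairwise binding: bind_rows(bind_rows(tabs[[1]], tabs[[2]]), tabs[[3]])
via_reduce <- reduce(tabs, bind_rows)

# map_dfr applies the function to each element, then binds once at the end
via_map <- map_dfr(tabs, identity)

# Compare the two results
same_result <- isTRUE(all.equal(via_reduce, via_map))
```

One practical difference: map_dfr (like bind_rows on a list) accepts an .id argument to record which list element each row came from, while the reduce version has no direct equivalent.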