How to combine lists into a new data frame

I have a few lists, and each list holds similar data frames with the same names but different values. I am not sure how to combine the data frames across the different lists, both row-wise and column-wise. I was not able to create a reprex from my real lists, so below is sample data for 3 lists.

I need to combine the data frames that have the same name in all three lists into a new data frame.

For example, df1 from all lists should be combined row-wise, and similarly df2 from all lists.

## Each List consists of 2 data frames as examples shown below:
# List1
df1 <- data.frame(
  stringsAsFactors = FALSE,
              date = c("1996 Jan", "1996 Feb", "1996 Mar"),
             Class = c("Class A", "Class A", "Class A"),
            Builds = c(376, 393, 524)
 )

df2 <- data.frame(
  stringsAsFactors = FALSE,
              date = c("1996 Jan", "1996 Feb", "1996 Mar"),
             Class = c("Class B", "Class B", "Class B"),
            Builds = c(300, 400, 500)
 )

list_1 <- list(df1, df2)

# List 2 
df1 <- data.frame(
  stringsAsFactors = FALSE,
              date = c("1996 Apr", "1996 May", "1996 Jun"),
             Class = c("Class A", "Class A", "Class A"),
            Builds = c(525, 544, 516)
 )

df2 <- data.frame(
  stringsAsFactors = FALSE,
              date = c("1996 Apr", "1996 May", "1996 Jun"),
             Class = c("Class B", "Class B", "Class B"),
            Builds = c(301, 405, 509)
 )

list_2 <- list(df1, df2)

# List3
df1 <- data.frame(
  stringsAsFactors = FALSE,
              date = c("1996 Jul", "1996 Aug", "1996 Sep"),
             Class = c("Class A", "Class A", "Class A"),
            Builds = c(428, 451, 484)
 )

df2 <- data.frame(
  stringsAsFactors = FALSE,
              date = c("1996 Jul", "1996 Aug", "1996 Sep"),
             Class = c("Class B", "Class B", "Class B"),
            Builds = c(200, 300, 400)
 )

list_3 <- list(df1, df2)
I would also like to know how to combine lists like the following column-wise:
## Each List consists of 2 data frames as examples
# List 4
df1 <- data.frame(
  stringsAsFactors = FALSE,
              date = c("1996 Jan", "1996 Feb", "1996 Mar"),
             Class = c("Class A", "Class A", "Class A"),
            Builds = c(376, 393, 524)
 )

df2 <- data.frame(
  stringsAsFactors = FALSE,
              date = c("1996 Jan", "1996 Feb", "1996 Mar"),
             Class = c("Class B", "Class B", "Class B"),
            Builds = c(300, 400, 500)
 )

list_4 <- list(df1, df2)

# List 5
df1 <- data.frame(
  stringsAsFactors = FALSE,
              date = c("1996 Jan", "1996 Feb", "1996 Mar"),
             Class = c("Class A", "Class A", "Class A"),
            Sales = c(525, 544, 516)
 )

df2 <- data.frame(
  stringsAsFactors = FALSE,
              date = c("1996 Jan", "1996 Feb", "1996 Mar"),
             Class = c("Class B", "Class B", "Class B"),
            Sales = c(301, 405, 509)
 )

list_5 <- list(df1, df2)

Thanks for your help in advance!

Hi,

Combining data frames in lists is very easy. All you need is the bind_rows or bind_cols function from the dplyr package. There are base R functions rbind and cbind as well, but I feel they don't always perform as well, especially if the order of the columns is not the same in the data frames.

So all you need is this:

library(dplyr)

bind_rows(list_1, list_2, list_3)
bind_cols(list_1, list_2, list_3)

Note that bind_cols creates new column names, because the original column names are the same in every data frame.
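
A minimal sketch of what the bind_rows call does, using two of the lists from the question. The make_df helper is hypothetical and only keeps the example short. Note that bind_rows flattens the lists into a single data frame; splitting by Class afterwards is one possible way to get one data frame per class, assuming Class identifies which data frame a row came from.

```r
library(dplyr)

# Hypothetical helper (not from the thread): builds one small data frame
# shaped like those in the question
make_df <- function(months, class, builds) {
  data.frame(
    date = paste("1996", months),
    Class = class,
    Builds = builds,
    stringsAsFactors = FALSE
  )
}

list_1 <- list(
  make_df(c("Jan", "Feb", "Mar"), "Class A", c(376, 393, 524)),
  make_df(c("Jan", "Feb", "Mar"), "Class B", c(300, 400, 500))
)
list_2 <- list(
  make_df(c("Apr", "May", "Jun"), "Class A", c(525, 544, 516)),
  make_df(c("Apr", "May", "Jun"), "Class B", c(301, 405, 509))
)

# bind_rows() flattens both lists and stacks all four data frames
all_rows <- bind_rows(list_1, list_2)
dim(all_rows)  # 12 rows, 3 columns

# Assumption: Class tells the data frames apart, so splitting on it
# gives one combined data frame per class
by_class <- split(all_rows, all_rows$Class)
names(by_class)  # "Class A" "Class B"
```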

Hope this helps,
PJ

I understand you want element-wise row or column binding. Is this what you want to achieve?

JW


library(purrr)
library(dplyr)
mylist  <- map2(list_2,list_3, bind_rows)
# [[1]]
#       date   Class Builds
# 1 1996 Apr Class A    525
# 2 1996 May Class A    544
# 3 1996 Jun Class A    516
# 4 1996 Jul Class A    428
# 5 1996 Aug Class A    451
# 6 1996 Sep Class A    484
# 
# [[2]]
#       date   Class Builds
# 1 1996 Apr Class B    301
# 2 1996 May Class B    405
# 3 1996 Jun Class B    509
# 4 1996 Jul Class B    200
# 5 1996 Aug Class B    300
# 6 1996 Sep Class B    400
#

Thanks @Jwvz001! Yes, this is the row-wise output I am looking for. But on my end it doesn't work and gives the following error when I try it on 3 lists:

mylist  <- map2(list_1,list_2,list_3, bind_rows) 

Error:

Also, for the column-wise case, joining on the common columns works when I do it one data frame at a time. But when I try to write a function that handles all data frames at once, I am doing something wrong:

#Works considering individual data frame
df1_var <- list_4[[1]] %>%
  inner_join(list_5[[1]]) %>%
  inner_join(list_6[[1]])  # I didn't include this list earlier; for this example I created it as a duplicate of list_5 (provided at the bottom)

# Doesn't work with function to handle all data frames. What am I doing wrong here?
vars <- function(x, y, z){
  x <- x%>%
  inner_join(y) %>%
  inner_join(z) 
  
  return(x)
}

vars_all <- map(list_4, list_5, list_6, ~vars(.))
# I get the same error here as in the screenshot above for your code example: Error: Index 1 must have length 1, not 3

Example of List_6

# List_6
df1 <- data.frame(
  stringsAsFactors = FALSE,
              date = c("1996 Jan", "1996 Feb", "1996 Mar"),
             Class = c("Class A", "Class A", "Class A"),
            Sales = c(500, 520, 510)
 )

df2 <- data.frame(
  stringsAsFactors = FALSE,
              date = c("1996 Jan", "1996 Feb", "1996 Mar"),
             Class = c("Class B", "Class B", "Class B"),
            Sales = c(201, 305, 409)
 )

list_6 <- list(df1, df2)

Thanks @pieterjanvc! bind_rows and bind_cols don't seem to work element-wise, i.e. on matching data frames from the different lists.

Hi,

Sorry, map2 only takes two lists as arguments (.x and .y), so a third list is treated as the function to apply. This throws the error. With three or more lists, you need the pmap function. See the purrr documentation.
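
A minimal sketch of the pmap approach for both cases in this thread, assuming data shaped like the question's lists. The make_df and join_three helpers are hypothetical names used only for illustration. The join keys are given explicitly, because list_5 and list_6 both contain a Sales column that would otherwise be used as a join key.

```r
library(purrr)
library(dplyr)

# Hypothetical helper: one small data frame with a configurable value column
make_df <- function(months, class, values, value_name = "Builds") {
  df <- data.frame(
    date = paste("1996", months),
    Class = class,
    value = values,
    stringsAsFactors = FALSE
  )
  names(df)[3] <- value_name
  df
}

q1 <- c("Jan", "Feb", "Mar")
q2 <- c("Apr", "May", "Jun")
q3 <- c("Jul", "Aug", "Sep")

# Row-wise case: same columns, different months
list_1 <- list(make_df(q1, "Class A", c(376, 393, 524)),
               make_df(q1, "Class B", c(300, 400, 500)))
list_2 <- list(make_df(q2, "Class A", c(525, 544, 516)),
               make_df(q2, "Class B", c(301, 405, 509)))
list_3 <- list(make_df(q3, "Class A", c(428, 451, 484)),
               make_df(q3, "Class B", c(200, 300, 400)))

# pmap() takes ONE list of lists and calls the function with the i-th
# element of each: bind_rows(list_1[[i]], list_2[[i]], list_3[[i]])
rows_by_df <- pmap(list(list_1, list_2, list_3), bind_rows)

# Column-wise case: same months, different value columns
list_4 <- list(make_df(q1, "Class A", c(376, 393, 524)),
               make_df(q1, "Class B", c(300, 400, 500)))
list_5 <- list(make_df(q1, "Class A", c(525, 544, 516), "Sales"),
               make_df(q1, "Class B", c(301, 405, 509), "Sales"))
list_6 <- list(make_df(q1, "Class A", c(500, 520, 510), "Sales"),
               make_df(q1, "Class B", c(201, 305, 409), "Sales"))

# Explicit keys keep both Sales columns (as Sales.x and Sales.y)
# instead of joining on them
join_three <- function(x, y, z) {
  x %>%
    inner_join(y, by = c("date", "Class")) %>%
    inner_join(z, by = c("date", "Class"))
}

cols_by_df <- pmap(list(list_4, list_5, list_6), join_three)
```

Both results are lists with one data frame per position: rows_by_df[[1]] holds the nine Class A rows, and cols_by_df[[1]] holds three Class A rows with the Builds and Sales columns side by side.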

Good luck, JW

Yes, I figured that out: it worked when I did 2 lists at a time. I will look into pmap; hopefully that can reduce the repetition. One last concern: after getting all data frames the way I want, I want to bind them by rows. I have done this before with no issues, but this time I am getting errors with that too.

vars <- map2(list1, list2, inner_join)

vars <- bind_rows(vars) 

Error:

Thanks again for your help!

Is there a typo, or are 'list1' and 'list2' different from 'list_1' and 'list_2'? With 'list_1' and 'list_2' it works, although the join may return an empty tibble, because the data frames share no common key values.
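
A small sketch of that point, using cut-down data frames that are assumptions modelled on the question's data: when the joined data frames share no key values, inner_join returns zero rows and bind_rows then yields an empty result, whereas data frames that share date and Class produce rows.

```r
library(purrr)
library(dplyr)

# Cut-down, illustrative versions of the thread's data frames
df_builds_q1 <- data.frame(date = c("1996 Jan", "1996 Feb"), Class = "Class A",
                           Builds = c(376, 393), stringsAsFactors = FALSE)
df_builds_q2 <- data.frame(date = c("1996 Apr", "1996 May"), Class = "Class A",
                           Builds = c(525, 544), stringsAsFactors = FALSE)
df_sales_q1  <- data.frame(date = c("1996 Jan", "1996 Feb"), Class = "Class A",
                           Sales = c(525, 544), stringsAsFactors = FALSE)

# Same columns but different months: no row matches on all three keys
no_match <- map2(list(df_builds_q1), list(df_builds_q2), inner_join,
                 by = c("date", "Class", "Builds"))
nrow(bind_rows(no_match))  # 0

# Same months, different value column: rows match on date and Class
matched <- map2(list(df_builds_q1), list(df_sales_q1), inner_join,
                by = c("date", "Class"))
nrow(bind_rows(matched))   # 2
```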

Hi,

It does seem to work for me though, unless I'm not understanding what you'd like to do...

library(dplyr)

## Each List consists of 2 data frames as examples shown below:
# List1
df1 <- data.frame(
  stringsAsFactors = FALSE,
  date = c("1996 Jan", "1996 Feb", "1996 Mar"),
  Class = c("Class A", "Class A", "Class A"),
  Builds = c(376, 393, 524)
)

df2 <- data.frame(
  stringsAsFactors = FALSE,
  date = c("1996 Jan", "1996 Feb", "1996 Mar"),
  Class = c("Class B", "Class B", "Class B"),
  Builds = c(300, 400, 500)
)

list_1 <- list(df1, df2)

# List 2 
df1 <- data.frame(
  stringsAsFactors = FALSE,
  date = c("1996 Apr", "1996 May", "1996 Jun"),
  Class = c("Class A", "Class A", "Class A"),
  Builds = c(525, 544, 516)
)

df2 <- data.frame(
  stringsAsFactors = FALSE,
  date = c("1996 Apr", "1996 May", "1996 Jun"),
  Class = c("Class B", "Class B", "Class B"),
  Builds = c(301, 405, 509)
)

list_2 <- list(df1, df2)

# List3
df1 <- data.frame(
  stringsAsFactors = FALSE,
  date = c("1996 Jul", "1996 Aug", "1996 Sep"),
  Class = c("Class A", "Class A", "Class A"),
  Builds = c(428, 451, 484)
)

df2 <- data.frame(
  stringsAsFactors = FALSE,
  date = c("1996 Jul", "1996 Aug", "1996 Sep"),
  Class = c("Class B", "Class B", "Class B"),
  Builds = c(200, 300, 400)
)

list_3 <- list(df1, df2)

bind_rows(list_1, list_2, list_3)
#>        date   Class Builds
#> 1  1996 Jan Class A    376
#> 2  1996 Feb Class A    393
#> 3  1996 Mar Class A    524
#> 4  1996 Jan Class B    300
#> 5  1996 Feb Class B    400
#> 6  1996 Mar Class B    500
#> 7  1996 Apr Class A    525
#> 8  1996 May Class A    544
#> 9  1996 Jun Class A    516
#> 10 1996 Apr Class B    301
#> 11 1996 May Class B    405
#> 12 1996 Jun Class B    509
#> 13 1996 Jul Class A    428
#> 14 1996 Aug Class A    451
#> 15 1996 Sep Class A    484
#> 16 1996 Jul Class B    200
#> 17 1996 Aug Class B    300
#> 18 1996 Sep Class B    400

Created on 2021-02-09 by the reprex package (v1.0.0)