I have a heterogeneous list of vectors that I want to convert to a two-column data frame. I have found a few solutions, but they are all quite complex, and I’m having trouble searching for ideas because I only find solutions for lists of vectors of the same length. Is there a simpler way to accomplish this task?
Here is a minimal example of the starting list. The vectors vary in length and can also be empty vectors or even NULL.
input <- list(A = letters[1:3], B = letters[3:4], C = NULL, D = character(0))
input
## $A
## [1] "a" "b" "c"
##
## $B
## [1] "c" "d"
##
## $C
## NULL
##
## $D
## character(0)
And here is my desired output data frame. Each row corresponds to one element of one of the vectors in the list: the first column is the name of the list element and the second column is the element of the vector. List elements with no data (e.g. NULL or character(0)) are omitted:
output <- data.frame(name = c(rep("A", length(input$A)), rep("B", length(input$B))),
                     item = c(input$A, input$B), stringsAsFactors = FALSE)
output
##   name item
## 1    A    a
## 2    A    b
## 3    A    c
## 4    B    c
## 5    B    d
I tried unlist(), which properly omits the empty list elements. Unfortunately, it appends numbers to the names, and removing them would require a fragile regex (e.g. what if the names of the list elements ended in numbers?).
list2df_unlist <- function(x) {
  tmp <- unlist(x)
  data.frame(name = names(tmp), item = tmp, stringsAsFactors = FALSE)
}
list2df_unlist(input)
##    name item
## A1   A1    a
## A2   A2    b
## A3   A3    c
## B1   B1    c
## B2   B2    d
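One possible way around the renaming, as a minimal sketch: ask unlist() to drop the names with use.names = FALSE, and build the name column separately by repeating each list name once per element with rep() and lengths(). The function name list2df_rep is hypothetical, and the sketch assumes that every list element is named, that each one is either an atomic vector or NULL, and that at least one element is non-empty.

```r
# Hypothetical helper (not part of the original attempts): base R only.
list2df_rep <- function(x) {
  data.frame(
    # Repeat each list name once per element; empty elements repeat zero times
    name = rep(names(x), lengths(x)),
    # Flatten the values without the auto-numbered names
    item = unlist(x, use.names = FALSE),
    stringsAsFactors = FALSE
  )
}

input <- list(A = letters[1:3], B = letters[3:4], C = NULL, D = character(0))
result <- list2df_rep(input)
result
```

Because lengths() returns 0 for both NULL and character(0), those elements contribute no rows, so no separate filtering step is needed in this sketch.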
My base R solution uses mapply() + do.call() and also requires a separate helper function to filter out the empty list elements.
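list2df_mapply <- function(x) {
  # Build a data frame for one list element; return NULL for empty ones
  list_to_df <- function(name, vec) {
    if (is.null(vec) || length(vec) == 0) return(NULL)
    data.frame(name = name, item = vec, stringsAsFactors = FALSE)
  }
  # SIMPLIFY = FALSE keeps the result a list even when no element is empty
  tmp <- mapply(list_to_df, as.list(names(x)), x, SIMPLIFY = FALSE)
  do.call(rbind, tmp)
}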
list2df_mapply(input)
##   name item
## 1    A    a
## 2    A    b
## 3    A    c
## 4    B    c
## 5    B    d
My solution with purrr is simpler because it replaces mapply() + do.call() with a single call to map2_dfr(), but it still requires the helper function.
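list2df_purrr <- function(x) {
  # Build a data frame for one list element; return NULL for empty ones
  list_to_df <- function(name, vec) {
    if (is.null(vec) || length(vec) == 0) return(NULL)
    data.frame(name = name, item = vec, stringsAsFactors = FALSE)
  }
  # Use the function argument x rather than the global input
  purrr::map2_dfr(names(x), x, list_to_df)
}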
list2df_purrr(input)
##   name item
## 1    A    a
## 2    A    b
## 3    A    c
## 4    B    c
## 5    B    d
I also explored purrr::imap_dfr(), but couldn’t get it to work. Any ideas on how to make this transformation code more readable? Thanks!