How to remove 'empty' nested lists with purrr?

I have a list column with nested lists and want to remove those lists which contain only NA.
I saw this and this, but it seems I am missing something crucial. Many thanks.

library(tidyverse)

x <- list(list(c("A", "B"), NA, NA, NA), 
          list(c("C","E"), NA, "D", NA, NA))

intended.answer <- list(list(c("A", "B")), 
          list(c("C","E"),"D"))

x %>% map_depth(.,2, has_element, NA)
#> [[1]]
#> [[1]][[1]]
#> [1] FALSE
#> 
#> [[1]][[2]]
#> [1] TRUE
#> 
#> [[1]][[3]]
#> [1] TRUE
#> 
#> [[1]][[4]]
#> [1] TRUE
#> 
#> 
#> [[2]]
#> [[2]][[1]]
#> [1] FALSE
#> 
#> [[2]][[2]]
#> [1] TRUE
#> 
#> [[2]][[3]]
#> [1] FALSE
#> 
#> [[2]][[4]]
#> [1] TRUE
#> 
#> [[2]][[5]]
#> [1] TRUE

x %>% discard(., has_element(.,NA))
#> Error in probe(.x, .p, ...): length(.p) == length(.x) is not TRUE

x %>% compact(.) %>% purrr::discard(.,is.na(.))
#> [[1]]
#> [[1]][[1]]
#> [1] "A" "B"
#> 
#> [[1]][[2]]
#> [1] NA
#> 
#> [[1]][[3]]
#> [1] NA
#> 
#> [[1]][[4]]
#> [1] NA
#> 
#> 
#> [[2]]
#> [[2]][[1]]
#> [1] "C" "E"
#> 
#> [[2]][[2]]
#> [1] NA
#> 
#> [[2]][[3]]
#> [1] "D"
#> 
#> [[2]][[4]]
#> [1] NA
#> 
#> [[2]][[5]]
#> [1] NA

Here are two ways to do what you want. One (the recursive `remove_empty` function) is more general and more involved; the other (`map` with `discard`) does exactly what you want, but won't work with, for example, more deeply nested lists.

library(tidyverse)

x <- list(list(c("A", "B"), NA, NA, NA), 
          list(c("C","E"), NA, "D", NA, NA))

remove_empty <- function(x){
  if(is.list(x)) {
    x %>%
      purrr::discard(rlang::is_na) %>%
      purrr::map(remove_empty)
  } else {
    x
  }
}

intended.answer <- list(list(c("A", "B")), 
                        list(c("C","E"),"D"))

res1 <- x %>% map(., discard, .p = rlang::is_na)

res2 <- remove_empty(x)

all.equal(intended.answer, res1)
#> [1] TRUE
all.equal(intended.answer, res2)
#> [1] TRUE

Created on 2019-02-07 by the reprex package (v0.2.1)
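
To make the difference between the two approaches concrete, here is a small sketch on a list with one extra level of nesting. The `deep` input is a made-up example, not data from the question, and the one-level call is written with an explicit function instead of `.p =`, which should be equivalent to `res1` above.

```r
library(purrr)

# Hypothetical input (not from the thread): one extra level of nesting
deep <- list(list(c("A", "B"), NA, list("C", NA)))

# Same recursive idea as remove_empty() above, written without the pipe
remove_empty <- function(x) {
  if (is.list(x)) {
    map(discard(x, rlang::is_na), remove_empty)
  } else {
    x
  }
}

# One-level approach: only looks at the elements of each inner list
one_level <- map(deep, function(el) discard(el, rlang::is_na))

# Recursive approach: descends into every nested list
recursive <- remove_empty(deep)

str(one_level)  # the NA inside list("C", NA) is still there
str(recursive)  # that NA is removed as well
```

Note that `rlang::is_na` only flags length-one `NA` values, so vectors such as `c("A", "B")` are never dropped.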

Many thanks for your reply! Wouldn't have been able to figure it out by myself.

FWIW, I'd have done it with llply; I'm not sure whether it is more or less efficient than mishabalyasin's solution, but here is the llply solution:

library(plyr)

# remove NA elements from a flat list
# (plain indexing is used instead of %>%, which plyr does not provide)
flat.list.rm.na <- function(l) {
    if (is.list(l) && any(is.na(l))) {
        l[!is.na(l)]
    } else {
        l
    }
}

remove.empty <- function(l) llply(l, flat.list.rm.na)

#testing

x <- list(list(c("A", "B"), NA, NA, NA), 
          list(c("C","E"), NA, "D", NA, NA))

result <- remove.empty(x)
intended.result <- list(list(c("A", "B")), 
                        list(c("C","E"),"D"))
identical(result, intended.result)
#> [1] TRUE

It's easy to generalize it to work with all nesting levels.
Hope that helps.
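
One possible way to generalize this to all nesting levels is to recurse into every element that is itself a list. The sketch below uses base R `lapply` rather than `llply` so that it has no package dependencies; the name `remove_na_deep` and the `deep` test input are made up for illustration.

```r
# Recursively drop NA elements from a list at any depth (base R only).
# remove_na_deep is a hypothetical helper name, not part of any package.
remove_na_deep <- function(l) {
  if (!is.list(l)) {
    return(l)  # atomic vectors are returned unchanged
  }
  # an element counts as "empty" only if it is a single atomic NA
  is_single_na <- vapply(
    l,
    function(el) is.atomic(el) && length(el) == 1 && is.na(el),
    logical(1)
  )
  # keep the rest and descend into any nested lists
  lapply(l[!is_single_na], remove_na_deep)
}

# the list from the question
x <- list(list(c("A", "B"), NA, NA, NA),
          list(c("C", "E"), NA, "D", NA, NA))
result <- remove_na_deep(x)

# a made-up, more deeply nested example
deep <- list(list("A", NA, list("B", NA)))
deep_result <- remove_na_deep(deep)
```

A nested list that contains nothing but `NA` ends up as an empty `list()` with this sketch; dropping those as well would need an extra filtering step.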
