Using any(X %in% Y) as a filter on a data frame whose column holds a vector in each row (a list column) does not give the results I expect.
Reproducible example:
> df <- data.frame(K=c(1,2,3), L=rep(NA, 3))
> df$L<-list(c("a","b","c"), c("b","c","d"), c("c","d","e"))
> df
  K       L
1 1 a, b, c
2 2 b, c, d
3 3 c, d, e
>
> M <- c("a", "e")
>
> M %in% c("a","b","c")
[1] TRUE FALSE
> any(M %in% c("a","b","c"))
[1] TRUE
> M %in% c("b","c","d")
[1] FALSE FALSE
> any(M %in% c("b","c","d"))
[1] FALSE
> M %in% c("c","d","e")
[1] FALSE TRUE
> any(M %in% c("c","d","e"))
[1] TRUE
>
> filtered2 <- df[any(M %in% df$L),]
> filtered2
[1] K L
<0 rows> (or 0-length row.names)
> nrow(filtered2)
[1] 0
>
> # dplyr's filter() gives the same result, as does mutate()
>
> library(dplyr)
>
> filtered <- df %>% filter(any(M %in% L))
> nrow(filtered)
[1] 0
>
> df %>% mutate(X=any(M %in% L))
  K       L     X
1 1 a, b, c FALSE
2 2 b, c, d FALSE
3 3 c, d, e FALSE
As the direct tests above show, the filter should have returned the first and third rows, and mutate should have set X to TRUE, FALSE, TRUE rather than FALSE, FALSE, FALSE.
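To make the expected result concrete, here is a minimal sketch that applies the same test to each element of L separately instead of to the whole column at once. It assumes the intent is one TRUE/FALSE per row; the names `hits` and `filtered_rows` are only illustrative and not part of the original session. Note that any() always collapses its input to a single value, so it has to be called once per row to get a row-wise answer.

```r
# Rebuild the example data frame with a list column
df <- data.frame(K = c(1, 2, 3))
df$L <- list(c("a", "b", "c"), c("b", "c", "d"), c("c", "d", "e"))
M <- c("a", "e")

# Test each row's vector on its own; vapply() returns one logical per row
# ('hits' is an illustrative name)
hits <- vapply(df$L, function(x) any(M %in% x), logical(1))
print(hits)

# Use the per-row logical vector as the row filter
filtered_rows <- df[hits, ]
print(nrow(filtered_rows))
```

With this per-row vector, rows 1 and 3 are kept, which matches the direct tests on the individual vectors.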