I often have to handle complex filters in my code. Example:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(stringr)
test <- structure(list(type = c("Clar", "Tech", "Clar", "Clar", "Clar",
"Clar", "Tech", "Tech", "Tech", "Clar", "Clar", "Clar", "Tech",
"Tech", "Tech", "Clar", "Clar", "Clar", "Tech", "Tech", "Clar"
), class = c("A", "B", "A", "A", "A", "A", "B", "B", "B", "A",
"A", "A", "B", "B", "B", "A", "A", "A", "B", "B", "A"), owner = c("team A1",
"team A1 NY", "team A3", NA, "team A2", NA, "team A1", "team A1",
"team A1 SF", "team A1", "team A1", "team A2", "team A1", "team A1",
"team A1 NY", NA, "team A1 SF", "team A1", "team A1", "team A1",
"team A3"), answer = structure(c(NA, 1L, NA, NA, NA, NA, 1L,
1L, 1L, NA, NA, NA, 1L, 1L, 1L, NA, NA, NA, 1L, 1L, NA), .Label = c("Accept",
"Change Solution"), class = "factor"), complexity = structure(c(2L,
NA, 2L, 2L, 2L, 2L, NA, NA, NA, 2L, 2L, 2L, NA, NA, NA, 2L, 2L,
2L, NA, NA, NA), .Label = c("High", "Low", "Medium"), class = "factor")), row.names = c(NA,
21L), class = "data.frame")
test_filtered_1 <- test %>%
  filter(type == "Clar" &
           complexity == "Low")
test_filtered_1
#> type class owner answer complexity
#> 1 Clar A team A1 <NA> Low
#> 2 Clar A team A3 <NA> Low
#> 3 Clar A <NA> <NA> Low
#> 4 Clar A team A2 <NA> Low
#> 5 Clar A <NA> <NA> Low
#> 6 Clar A team A1 <NA> Low
#> 7 Clar A team A1 <NA> Low
#> 8 Clar A team A2 <NA> Low
#> 9 Clar A <NA> <NA> Low
#> 10 Clar A team A1 SF <NA> Low
#> 11 Clar A team A1 <NA> Low
test_filtered_2 <- test %>%
  filter(type == "Tech" &
           class == "B" &
           str_detect(owner, "A1") &
           answer == "Accept")
test_filtered_2
#> type class owner answer complexity
#> 1 Tech B team A1 NY Accept <NA>
#> 2 Tech B team A1 Accept <NA>
#> 3 Tech B team A1 Accept <NA>
#> 4 Tech B team A1 SF Accept <NA>
#> 5 Tech B team A1 Accept <NA>
#> 6 Tech B team A1 Accept <NA>
#> 7 Tech B team A1 NY Accept <NA>
#> 8 Tech B team A1 Accept <NA>
#> 9 Tech B team A1 Accept <NA>
test_filtered_3 <- test %>%
  filter((type == "Clar" &
            complexity == "Low") |
           (class == "B" &
              str_detect(owner, "A1") &
              answer == "Accept"))
test_filtered_3
#> type class owner answer complexity
#> 1 Clar A team A1 <NA> Low
#> 2 Tech B team A1 NY Accept <NA>
#> 3 Clar A team A3 <NA> Low
#> 4 Clar A <NA> <NA> Low
#> 5 Clar A team A2 <NA> Low
#> 6 Clar A <NA> <NA> Low
#> 7 Tech B team A1 Accept <NA>
#> 8 Tech B team A1 Accept <NA>
#> 9 Tech B team A1 SF Accept <NA>
#> 10 Clar A team A1 <NA> Low
#> 11 Clar A team A1 <NA> Low
#> 12 Clar A team A2 <NA> Low
#> 13 Tech B team A1 Accept <NA>
#> 14 Tech B team A1 Accept <NA>
#> 15 Tech B team A1 NY Accept <NA>
#> 16 Clar A <NA> <NA> Low
#> 17 Clar A team A1 SF <NA> Low
#> 18 Clar A team A1 <NA> Low
#> 19 Tech B team A1 Accept <NA>
#> 20 Tech B team A1 Accept <NA>
Created on 2020-02-24 by the reprex package (v0.3.0)
Rather than copying the filter code into two different parts of my script, is there a way to define filters programmatically and then pass them to filter()? I know that in this specific case, joining test_filtered_1 and test_filtered_2 would return the same dataframe as test_filtered_3, even though the filters are not the same. But what I'm looking for is something like this (not run, because it would give an error):
filter_1 <- type == "Clar" &
  complexity == "Low"
test_filtered_1 <- test %>%
  filter(filter_1)
filter_2 <- type == "Tech" &
  class == "B" &
  str_detect(owner, "A1") &
  answer == "Accept"
test_filtered_2 <- test %>%
  filter(filter_2)
filter_3 <- (type == "Clar" &
  complexity == "Low") |
  (class == "B" &
    str_detect(owner, "A1") &
    answer == "Accept")
test_filtered_3 <- test %>%
  filter(filter_3)
I guess this should be doable with functions?
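To make the idea concrete, here is a minimal sketch of two directions that might work. It assumes that rlang's expr() can capture a filter as an unevaluated expression and that the !! operator can splice it back into filter(). The small data frame df and the helper is_low_clar() are hypothetical names made up for illustration; they are not part of the test data above.

```r
library(dplyr)
library(stringr)
library(rlang)

# Hypothetical toy data with the same columns as `test` (assumption, for illustration only)
df <- data.frame(
  type       = c("Clar", "Tech", "Clar", "Tech"),
  class      = c("A", "B", "A", "B"),
  owner      = c("team A1", "team A1 NY", NA, "team A2"),
  answer     = c(NA, "Accept", NA, "Accept"),
  complexity = c("Low", NA, "Low", NA),
  stringsAsFactors = FALSE
)

# Option 1: store each filter as an unevaluated expression
filter_1 <- expr(type == "Clar" & complexity == "Low")
filter_2 <- expr(type == "Tech" & class == "B" &
                   str_detect(owner, "A1") & answer == "Accept")

# Build a combined filter from the stored pieces instead of retyping them
filter_3 <- expr((!!filter_1) | (!!filter_2))

# Unquote the stored expressions inside filter()
df_1 <- df %>% filter(!!filter_1)
df_2 <- df %>% filter(!!filter_2)
df_3 <- df %>% filter(!!filter_3)

# Option 2: wrap the condition in an ordinary function of the columns
is_low_clar <- function(type, complexity) {
  type == "Clar" & complexity == "Low"
}
df_1_fun <- df %>% filter(is_low_clar(type, complexity))
```

If this works as assumed, each condition is written once and can be reused or combined. The stored-expression version keeps the filter looking like the original code, while the function version may be easier to test on its own.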