Confused about converting filter_at + any_vars to filter + across in dplyr 1.0

I'm trying to filter a dataset to keep only the rows where a regex matches in any of several columns (an address that could appear in any of 6 to 10 columns). filter_at solves the problem, but the dplyr documentation (filter_all) says filter_at is superseded, and I don't understand how I'm supposed to use the combination of filter + across instead.

library(tidyverse)

df <- tribble(
  ~id, ~name, ~dir, ~something, ~dir_asd,
  1, "a", "address", 15, "some other",
  2, "b", "someplace", 10, "address",
  3, "c", "localhost", 2, "::1"
)
filter_at(df, vars(starts_with("dir")), any_vars(str_detect(., "address") == TRUE))
#> # A tibble: 2 x 5
#>      id name  dir       something dir_asd   
#>   <dbl> <chr> <chr>         <dbl> <chr>     
#> 1     1 a     address          15 some other
#> 2     2 b     someplace        10 address

filter(df, across(starts_with("dir"), ~ str_detect(.x, "address") == TRUE))
#> # A tibble: 0 x 5
#> # ... with 5 variables: id <dbl>, name <chr>, dir <chr>, something <dbl>,
#> #   dir_asd <chr>

filter(df, across(dir, ~ str_detect(.x, "address") == TRUE))
#> # A tibble: 1 x 5
#>      id name  dir     something dir_asd   
#>   <dbl> <chr> <chr>       <dbl> <chr>     
#> 1     1 a     address        15 some other

Created on 2020-06-23 by the reprex package (v0.3.0)

The last case partially works, which led me to think that the selection with starts_with is working, but that it's trying to match every column instead of any. Adding any_vars to the mix doesn't work:

filter(df, across(starts_with("dir"), ~ any_vars(str_detect(.x, "address") == TRUE)))
#> Error: Input must be a vector, not a `any_vars/quosure/formula` object.
#> Run `rlang::last_error()` to see where the error occurred.

Looking into across + any_vars led me here: https://github.com/tidyverse/dplyr/issues/4770, which ended in implementing c_across(), but I don't understand how it could be helpful here.

If you could point me in the right direction I would be very grateful.

This works, but not sure if you like it :grinning:

library(tibble)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(stringr)

df <- tribble(
  ~id, ~name, ~dir, ~something, ~dir_asd,
  1, "a", "address", 15, "some other",
  2, "b", "someplace", 10, "address",
  3, "c", "localhost", 2, "::1"
)

# TRUE for each row that has at least one TRUE among its columns
rowAny <- function(x) rowSums(x) > 0

df %>%
  filter(rowAny(
    across(
      .cols = starts_with("dir"),
      .fns  = ~ str_detect(.x, "address") == TRUE
    )
  ))
#> # A tibble: 2 x 5
#>      id name  dir       something dir_asd   
#>   <dbl> <chr> <chr>         <dbl> <chr>     
#> 1     1 a     address          15 some other
#> 2     2 b     someplace        10 address

Created on 2020-06-23 by the reprex package (v0.3.0)
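
As far as I can tell, the reason the helper is needed is that across() hands back one logical column per selected column, and filter() keeps a row only when all of them are TRUE. A minimal sketch of the same "any column matches" idea without rowSums(), combining the columns with base R's Reduce() and `|` (the names hits, keep and result are just for illustration):

```r
library(tibble)
library(dplyr)
library(stringr)

df <- tribble(
  ~id, ~name, ~dir, ~something, ~dir_asd,
  1, "a", "address", 15, "some other",
  2, "b", "someplace", 10, "address",
  3, "c", "localhost", 2, "::1"
)

# One logical column per selected column: did the regex match there?
hits <- df %>%
  transmute(across(starts_with("dir"), ~ str_detect(.x, "address")))

# OR the columns together, so a row is kept if any column matched
keep <- Reduce(`|`, hits)

result <- df[keep, ]
```

Replacing `|` with `&` here should reproduce the "every column must match" behaviour that the plain filter + across call shows in the question.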

Superseded just means that you should be aware of the successor, as it is thought likely to be more useful to you more often. You can continue to use superseded functions; they are not considered at risk of deprecation or removal. It's common for certain functions to have strengths in particular, narrow cases over others that may generally be preferred. Don't feel bad if you decide to continue using filter_at for the sorts of tasks you are asking about (if you prefer it); there's no shame in it.

Thanks, but it doesn't answer the question :slight_smile:

Yeah, I totally understand. I was changing all my code to use across, and this is the only case where I couldn't find a way to do it :/. I understand that it might be a corner case, or maybe that I should wait a bit.

I did not see a question. Apparently you had one :grinning:

Well, it's just how I'm supposed to use the across idiom in place of filter_at.
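
For reference, two sketches of how the filter_at + any_vars call might be written without a hand-rolled helper. The first assumes dplyr 1.0.4 or later, where if_any() is available (it is not in the 1.0.0 release used in the reprexes above). The second uses rowwise() + c_across(), which the linked issue led to and which exists in dplyr 1.0.0. The result names are only for illustration.

```r
library(tibble)
library(dplyr)
library(stringr)

df <- tribble(
  ~id, ~name, ~dir, ~something, ~dir_asd,
  1, "a", "address", 15, "some other",
  2, "b", "someplace", 10, "address",
  3, "c", "localhost", 2, "::1"
)

# Assumes dplyr >= 1.0.4: keep rows where any selected column matches
res_if_any <- df %>%
  filter(if_any(starts_with("dir"), ~ str_detect(.x, "address")))

# dplyr 1.0.0: go row by row, collect the selected columns into one
# character vector with c_across(), and ask whether any value matches
res_c_across <- df %>%
  rowwise() %>%
  filter(any(str_detect(c_across(starts_with("dir")), "address"))) %>%
  ungroup()
```

Both should return the rows with id 1 and 2, the same as the filter_at version. The rowwise() variant evaluates the condition once per row, so it is likely to be slower on large data.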
