The reference for across() and if_any() includes a usage line that says if_any(.cols = everything(), ...). I took that to mean that if I did not supply a .cols argument, everything() would be used, but that is not what I observe.
Can someone explain to me what that notation means, or what I am doing wrong?
Example trying to select all rows with any variable greater than 50.
library(tidyverse)
t =
tribble(~x, ~y,
10, 100,
70, -140,
-60, 30,
-121, -300)
t %>% filter(if_any(.fn = ~ . > 50))
#> # A tibble: 0 x 2
#> # … with 2 variables: x <dbl>, y <dbl>
t %>% filter(if_any(everything(), .fn = ~ . > 50))
#> # A tibble: 2 x 2
#> x y
#> <dbl> <dbl>
#> 1 10 100
#> 2 70 -140
There seems to be a related issue with if_all().
I've found a couple of reported issues on the GitHub repo tidyverse/dplyr, but I have not checked closely whether they cover exactly this behavior. To me, too, it does not appear to work as expected.
library(tidyverse)
t = tribble(~x, ~y,
10, 100,
70, -140,
-60, 30,
-121, -300
)
# if_all() ----------------------------------------------------------------
## no .cols argument specified: row 4 should not have been returned
t %>% filter(if_all(.fn = ~ . > -200))
#> # A tibble: 4 x 2
#> x y
#> <dbl> <dbl>
#> 1 10 100
#> 2 70 -140
#> 3 -60 30
#> 4 -121 -300
## .cols argument provided: expected behavior
t %>% filter(if_all(everything(), .fn = ~ . > -200))
#> # A tibble: 3 x 2
#> x y
#> <dbl> <dbl>
#> 1 10 100
#> 2 70 -140
#> 3 -60 30
# if_any() ----------------------------------------------------------------
## no .cols argument specified: rows 1 + 2 expected to be returned
t %>% filter(if_any(.fns = ~ . > 50))
#> # A tibble: 0 x 2
#> # ... with 2 variables: x <dbl>, y <dbl>
## .cols argument provided: expected behavior
t %>% filter(if_any(everything(), .fn = ~ . > 50))
#> # A tibble: 2 x 2
#> x y
#> <dbl> <dbl>
#> 1 10 100
#> 2 70 -140
Correct answer: the change that made everything() the default for .cols was made on Feb 12, 2021, so an installed dplyr 1.0.4 is not recent enough to include it.
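For anyone hitting the same thing, here is a minimal sketch of how to check which dplyr is actually installed, and how to write the filter so that it does not depend on the default at all. The names dat and res are only for illustration, and I am not asserting here which release first ships the new default.

```r
library(dplyr)
library(tibble)

# The online reference may describe a newer dplyr than the one installed,
# so check the local version first.
packageVersion("dplyr")

# Same data as in the question.
dat <- tribble(
    ~x,   ~y,
    10,  100,
    70, -140,
   -60,   30,
  -121, -300
)

# Supplying .cols explicitly avoids relying on any default value.
res <- dat %>% filter(if_any(everything(), ~ . > 50))
res
```

Passing everything() explicitly should give the two expected rows on 1.0.4 as well as on later versions.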
@nirgrahamuk
Your example does not accurately parallel the situation with if_any.
In the source code you can see that the .cols argument of if_any actually does have a default value. If you could be a little more careful when posting a reply, we would keep the signal-to-noise ratio high.
Thanks for the pointer to look in the source code, though.
I diagnosed the issue you had with dplyr 1.0.4 and explained why it happens.
The solution added to the dev version 12 days ago simply underlines that my analysis of the situation as you presented it to us was correct...
If the documentation you read was not aligned with the current release of dplyr, that's all there is to it.
Thank you for explaining. I did not understand that I was not looking at the code I was running, and that the code you referenced was from the dplyr 1.0.4 that I was using.
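To avoid this kind of mismatch in future, one option is to inspect the function as it is installed locally rather than the online reference or the development source. A small sketch using only base R helpers (the name sig is illustrative):

```r
library(dplyr)

# args() returns the signature of the function as installed on this machine;
# printing it shows whether .cols carries a default value in this version.
sig <- args(dplyr::if_any)
sig

# The argument names alone, taken from the installed function.
names(formals(dplyr::if_any))
```

Running ?if_any in the same session likewise opens the help page that belongs to the installed version.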