When should I use `all_of(vars)` instead of just `vars`?

Let me start with an example.

library(tidyverse)

my_tbl = tibble(a = 1, b = 1, c = 1)

vars = c("a", "b")

my_tbl %>% select(vars)
## A tibble: 1 x 2
#      a     b
#  <dbl> <dbl>
#1     1     1

my_tbl %>% select(all_of(vars))
## A tibble: 1 x 2
#      a     b
#  <dbl> <dbl>
#1     1     1

vars = c("a", "b", "d")

my_tbl %>% select(vars)
#Error: Can't subset columns that don't exist.
#x Column `d` doesn't exist.
#Run `rlang::last_error()` to see where the error occurred.

my_tbl %>% select(all_of(vars))
#Error: Can't subset columns that don't exist.
#x Column `d` doesn't exist.
#Run `rlang::last_error()` to see where the error occurred.

As shown above, select(vars) and select(all_of(vars)) produce the same output. So when should I use select(all_of(vars)) instead of just select(vars)?

If you're trying to select a column that doesn't exist, it will throw an error in both cases as shown in your example.

If you want to silently ignore the missing columns, use any_of() instead of all_of().

The latter is supposed to be the safer choice here. Both all_of() and any_of() should be used when vars is a character vector of variable names.
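To make the difference between the two helpers concrete, here is a minimal sketch that reuses my_tbl and vars from the question. The tryCatch() wrapper and the name strict_failed are only there for illustration, so that the script keeps running after the error.

```r
library(dplyr)

my_tbl <- tibble(a = 1, b = 1, c = 1)
vars <- c("a", "b", "d")  # "d" is not a column of my_tbl

# any_of() keeps the names that exist and silently skips "d"
res <- my_tbl %>% select(any_of(vars))
names(res)
#> [1] "a" "b"

# all_of() is strict, so the missing "d" raises an error;
# tryCatch() turns that error into a flag for illustration
strict_failed <- tryCatch(
  {
    my_tbl %>% select(all_of(vars))
    FALSE
  },
  error = function(e) TRUE
)
strict_failed
#> [1] TRUE
```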


Thank you for the explanation, but actually, that doesn't answer my question.

> Both all_of() and any_of() should be used when vars is a character vector of variable names.

Why should all_of(vars) be used instead of just vars when vars is a character vector of variable names? It seems to me that both do the same thing, as shown in the example.

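With tidyverse functions it is not necessary to use quotes to refer to columns. The consequence, however, is that ambiguity can arise when you pass a character vector stored in a variable: if the data frame has a column with the same name as that variable, select() picks the column instead of the names stored in the variable.

Here is an extreme example to make this clear.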

library(dplyr)
Sepal.Length <- "Sepal.Width"

iris %>% select(Sepal.Length) %>% head(3)
#>   Sepal.Length
#> 1          5.1
#> 2          4.9
#> 3          4.7

iris %>% select(all_of(Sepal.Length)) %>% head(3)
#>   Sepal.Width
#> 1         3.5
#> 2         3.0
#> 3         3.2

For this reason it is better to systematically use any_of() or all_of() when selecting columns from a character vector of names.
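The same ambiguity is easy to hit inside a function, where you do not control which columns the data has. The sketch below uses two hypothetical helpers, pick_bare() and pick_all_of(), and a made-up tibble that happens to have a column called cols. Depending on your tidyselect version, a bare external vector may also trigger a note or a deprecation warning when there is no such clash.

```r
library(dplyr)

# Hypothetical helpers: `cols` is meant to be a character vector of names
pick_bare   <- function(data, cols) data %>% select(cols)
pick_all_of <- function(data, cols) data %>% select(all_of(cols))

# Made-up data with a column that shares its name with the argument
tbl <- tibble(a = 1, b = 2, cols = 3)

# The bare name is matched against the columns first, so column `cols` wins
bare_names <- names(pick_bare(tbl, c("a", "b")))
bare_names
#> [1] "cols"

# all_of() always uses the contents of the character vector
strict_names <- names(pick_all_of(tbl, c("a", "b")))
strict_names
#> [1] "a" "b"
```

With all_of() the helper behaves the same way whatever the columns of data are called.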

