Tidy evaluation: correct programming pattern when a quosure is used as LHS of mutate or to select a column

I've quickly become addicted to quosures, and I'm refactoring all the code in my EDAs so that it works with datasets that have different headers. I ran into two difficulties:

  1. When testing assertions (e.g., in the code below, checking that the column referred to by the quosure enquo(factor_var) is a factor), code of the form stopifnot(is.factor(dataframe[[!! factor_var]])) doesn't work, nor does code of the form stopifnot(is.factor(select(dataframe, !! factor_var))). As you can see below, I had to use quo_name.
  2. How do I get mutate to work when the quosure is on the left-hand side of the mutate assignment? Again, quo_name + !! + := seems to be the only way. Not very readable, but at least, once committed to memory, it is easy to reapply whenever it is needed.

Here is my code: test_function must check that factor_var is a factor, or abort. If it is a factor, it converts all the NA values into an explicit extra factor level using forcats::fct_explicit_na().

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(rlang)
library(forcats)

test_function <- function(dataframe, factor_var){
  
  factor_var <- enquo(factor_var)
  factor_var_name <- quo_name(factor_var)
  stopifnot(is.factor(dataframe[[factor_var_name]]))

  dataframe <- dataframe %>%
    mutate(!! factor_var_name := forcats::fct_explicit_na(!! factor_var))
  
  return(dataframe)
}

my_iris <- iris
n <- nrow(my_iris)
index <- sample(seq_len(n),10)
my_iris$Species[index] <- NA

my_iris <- test_function(my_iris, Species)

Created on 2018-09-11 by the reprex package (v0.2.0).

Question: by introducing quo_name in the stopifnot and mutate assignment, did I do the right thing? Or are there simpler solutions?

I would say, it's a perfectly reasonable solution in your case.
The reason stopifnot(is.factor(select(dataframe, !! factor_var))) didn't work is that you are using select where you should use pull. pull returns a vector (which you can then test for being a factor), while select returns a data frame with one column. is.factor() on a data frame is always FALSE, so the assertion fails even when the column itself is a factor.
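To make the select/pull difference concrete, here is a minimal sketch using the built-in iris data. The helper name check_is_factor is hypothetical and only for illustration. As a side note, base R's [[ does not process !!, which is presumably why the dataframe[[!! factor_var]] attempt failed as well.

```r
library(dplyr)
library(rlang)

# Hypothetical helper, for illustration only: compares the two approaches.
check_is_factor <- function(dataframe, factor_var) {
  factor_var <- enquo(factor_var)

  # select() returns a one-column data frame, which is never a factor
  via_select <- is.factor(select(dataframe, !! factor_var))

  # pull() extracts the column itself as a vector
  via_pull <- is.factor(pull(dataframe, !! factor_var))

  c(select = via_select, pull = via_pull)
}

check_is_factor(iris, Species)
```

With pull, the assertion in the question could be written as stopifnot(is.factor(pull(dataframe, !! factor_var))), without going through quo_name first.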
Finally, the LHS of the := operator must be a string or a symbol, not a quosure, so you need to convert it to one of those either way. Once you have it (as a string, for example), you might as well use it in dataframe[[factor_var_name]] to pull out the column.
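As a rough sketch of the two accepted forms on the LHS of :=, the hypothetical helpers below (add_one_str and add_one_sym are made-up names, not part of any package) unquote a string and a symbol respectively. Both are expected to give the same result.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: unquotes a string on the LHS of :=
add_one_str <- function(dataframe, var) {
  var <- enquo(var)
  var_name <- quo_name(var)            # e.g. "Sepal.Length"
  mutate(dataframe, !! var_name := (!! var) + 1)
}

# Hypothetical helper: unquotes a symbol on the LHS of :=
add_one_sym <- function(dataframe, var) {
  var <- enquo(var)
  var_sym <- sym(quo_name(var))        # symbol built from the string
  mutate(dataframe, !! var_sym := (!! var) + 1)
}

by_string <- add_one_str(iris, Sepal.Length)
by_symbol <- add_one_sym(iris, Sepal.Length)
```

Which form to use is mostly a matter of taste. The string form has the small advantage that the same value can be reused for dataframe[[var_name]].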
Given all of that, as I've said, it looks like a reasonable solution :)
