While reading this Introduction to Tidyeval by @lionel I was struck by this statement:
"Indirect references in quoting functions are rarely useful in scripts but they are invaluable for writing functions."
In my R scripts I find that I use indirect references in quoting functions all the time. I wonder if this is good practice?
Here's an example select and join using explicit variable names (adapted from r4ds):
library(tidyverse)
library(nycflights13)
flights2 <- flights %>%
  select(year:day, hour, origin, dest, tailnum, carrier)
flights2 %>%
  left_join(weather, by = c("year", "month", "day", "hour"))
When I write scripts like this I tend to abstract out the column names as a list of symbols. Typically my script has multiple data-wrangling operations that depend on the same lists of variables (particularly for grouping and merging), so defining each list once keeps out bugs and focuses the code on the internal logic of the operations.
timedims <- exprs(year, month, day, hour)
geodims <- exprs(origin, dest)
planedims <- exprs(tailnum, carrier)
flights2 <- flights %>%
  select(!!!timedims, !!!geodims, !!!planedims)
flights2 %>%
  left_join(weather, by = as.character(timedims))
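To make the conversion explicit, here is a minimal sketch of the round trip between a list of symbols and a character vector. The toy tibbles `flights_toy` and `weather_toy` are made-up stand-ins for `flights` and `weather` so the snippet runs without nycflights13, and I attach rlang explicitly because I am not sure `exprs()` is always available from `library(tidyverse)` alone.

```r
library(dplyr)
library(rlang)

# Hypothetical toy data standing in for flights and weather
flights_toy <- tibble(year = 2013, month = 1, day = 1:2, hour = 5,
                      dest = c("IAH", "MIA"))
weather_toy <- tibble(year = 2013, month = 1, day = 1:2, hour = 5,
                      temp = c(39, 41))

# One list of symbols, defined once
timedims <- exprs(year, month, day, hour)

# Symbols -> strings, for the by argument of a join
timecols <- as.character(timedims)

joined <- flights_toy %>%
  select(!!!timedims, dest) %>%
  left_join(weather_toy, by = timecols)

# Strings -> symbols, if the character vector is the source of truth instead
timedims_from_chr <- syms(timecols)
```

Either representation can serve as the single definition; which one is more convenient probably depends on whether the script splices into verbs or passes strings to `by` more often.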
I started out using quos() but have gravitated towards exprs() because it is easier to extract the symbols as a string. The most common reason I have to do this is the by list in *_join(). The alternative with quosures gets complicated. Is there a quos_name() function in the works?:
timedimq <- quos(year, month, day, hour)
flights2 %>%
  left_join(weather, by = map_chr(timedimq, quo_name))
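In the meantime, a small helper can hide the mapping. This is only a sketch: `quo_names()` is a name I made up, not an rlang function, and it assumes every quosure wraps a bare symbol, since `rlang::as_name()` is meant for symbols rather than calls like `year:day`.

```r
library(rlang)
library(purrr)

# Hypothetical helper: turn a list of quosured symbols into a plain
# character vector (names dropped so it can be passed straight to by)
quo_names <- function(quosures) {
  unname(map_chr(quosures, as_name))
}

timedimq <- quos(year, month, day, hour)
by_cols <- quo_names(timedimq)
```

With that in place the join call reads `left_join(weather, by = quo_names(timedimq))`.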
Note that select() supports sequences such as year:day, but *_join() throws an error when the same list is converted to strings for by. I don't see a good way around this.
timedims2 <- exprs(year:day, hour) # this is more compact
flights2 <- flights %>%
  select(!!!timedims2, !!!geodims, !!!planedims) # works here
flights2 %>%
  left_join(weather, by = as.character(timedims2)) # doesn't work here
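One possible workaround, sketched under the assumption that the sequence should be resolved against the left-hand table: let select() expand the expressions on a zero-row slice and take the resulting column names. The toy tibbles are again hypothetical stand-ins for `flights` and `weather`.

```r
library(dplyr)
library(rlang)

# Hypothetical toy data standing in for flights and weather
flights_toy <- tibble(year = 2013, month = 1, day = 1:2, hour = 5,
                      dest = c("IAH", "MIA"))
weather_toy <- tibble(year = 2013, month = 1, day = 1:2, hour = 5,
                      temp = c(39, 41))

timedims2 <- exprs(year:day, hour)

# select() understands year:day, so use it to expand the expressions
# into concrete column names; the zero-row slice avoids copying data
timecols2 <- names(select(flights_toy[0, ], !!!timedims2))

joined2 <- left_join(flights_toy, weather_toy, by = timecols2)
```

The expansion depends on the column order of the table it is resolved against, so the same `year:day` could mean different columns in a table laid out differently.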
What do you use to manage lists of symbols in your scripts?
Are there any plans for the *_join() functions to pick up support for lists of symbols in the by parameter?