Okay, so I quite enjoyed the challenge of figuring this one out. The way the pipe works, it creates a sequence of functions like the one below:
function(.)
testfun(.)
and evaluates them in turn, applying each function to the result of the previous step. This means that the variable being passed to `testfun()`, when used in a pipeline, is actually `.`, and it has the same value as whatever the previous step produced. Internally, the pipe calls that argument `value`, and that's why, if you play with some of the {rlang} functions, you'll get `value` out:
myDF %>% ensym
In this case, it's the same value as `myDF`, but it's now got a different name, `.`, and that's why it gives that result. When you think of it this way, your function is doing exactly what it's supposed to. It's the pipe that's being weird.
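To make the renaming concrete, here is a toy stand-in for the pipe written in base R. It is only a sketch of the idea, not magrittr's actual implementation; `toy_pipe()` and `name_of()` are hypothetical names made up for this illustration, and `myDF` here is a throwaway data frame.

```r
# Toy illustration only: this is NOT how magrittr is implemented.
myDF <- data.frame(a = 1:3)

# Hypothetical helper: report the expression that was supplied as `x`.
name_of <- function(x) deparse(substitute(x))

# Hypothetical mini-pipe: wrap the right-hand side in function(.) rhs(.)
# and call that wrapper with the value of the left-hand side.
toy_pipe <- function(lhs, rhs) {
  step <- function(.) rhs(.)
  value <- lhs
  step(value)
}

direct <- name_of(myDF)            # the helper sees the symbol myDF
piped  <- toy_pipe(myDF, name_of)  # the helper only ever sees `.`
print(direct)
print(piped)
```

Called directly, the helper sees `myDF`; called through the toy pipe, it only sees `.`, which mirrors what happens to `testfun()` inside a real pipeline.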
You can, however, look back over the call stack in which the current function is being evaluated (which is what error-finding functions like `traceback()` do). Within the pipe, this is a relatively deeply nested set of calls (about 9 calls deep, though the exact depth may depend on the magrittr version). However, the `sys.calls()` function can return this stack. Compare, for example, the following two outputs:
stack_fun <- function(x){
  sys.calls()
}
stack_fun(myDF)
myDF %>% stack_fun
The first element of this stack will be the initial call, in this case `myDF %>% stack_fun`. This is a call object, so we can pull out the left-hand side by extracting its second element (`%>%` is the first element and `stack_fun` is the third). Therefore, the `testfun()` function can be written as:
testfun <- function(objName){
  first_call <- sys.calls()[[1]]  # get the first entry on the call stack
  lhs <- first_call[[2]]          # get the second element of this entry
  z <- rlang::as_name(lhs)
  print(z)
}
myDF %>% testfun()
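To see why `[[2]]` picks out the left-hand side, it can help to take a pipe call apart by hand. The sketch below uses only base R: `quote()` captures the call without evaluating it, so neither magrittr nor `myDF` needs to exist. The variable names are made up for illustration.

```r
# Capture the call unevaluated; %>% is never actually run here.
pipe_call <- quote(myDF %>% stack_fun)

print(class(pipe_call))   # the class of a captured call
print(pipe_call[[1]])     # the function being called: the pipe itself
print(pipe_call[[2]])     # the left-hand side
print(pipe_call[[3]])     # the right-hand side

# The left-hand side is a symbol, which converts cleanly to a string.
lhs_name <- as.character(pipe_call[[2]])
print(lhs_name)
```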
But, that's not the end of our tale!
This just looks at the initial call; it doesn't actually seek out where the pipe is. For example, it wouldn't work with the following function, since the call to `f()` would be the first entry on the stack:
f <- function(x){
  x %>% testfun
}
And, in theory, you would want this to return "x", since that's what's being piped into `testfun()`. The same problem can show up when the function is nested inside other functions and/or pipelines: it only ever looks at what the user called at the top level, which is not necessarily where you want this function to look.
However, by inspecting the entire stack for a pipe, we can pull out the most recent (i.e. the lowest) entry that is a pipe:
get_lhs <- function(){
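  calls <- sys.calls()                   # every call currently on the stack
  call_firsts <- lapply(calls, `[[`, 1)  # the function part of each call
  pipe_calls <- vapply(call_firsts, identical, logical(1), quote(`%>%`))
  if(!any(pipe_calls)){
    NULL                                 # not called from within a pipe
  } else {
    pipe_calls <- which(pipe_calls)
    pipe_calls <- pipe_calls[length(pipe_calls)]  # the most recent pipe
    calls[[c(pipe_calls, 2)]]            # the left-hand side of that pipe
  }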
}
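The selection logic can be tried in isolation on a hand-built list of calls, without needing a real call stack. The list `fake_stack` below is an assumption made up for illustration (a real stack from `sys.calls()` will typically contain more entries), but the filtering steps follow the same pattern as `get_lhs()`.

```r
# Hypothetical stand-in for sys.calls(), outermost call first.
fake_stack <- list(
  quote(myDF %>% f),
  quote(f(.)),
  quote(x %>% testfun),
  quote(testfun(.))
)

# Which entries are calls to the pipe?
call_firsts <- lapply(fake_stack, `[[`, 1)
is_pipe <- vapply(call_firsts, identical, logical(1), quote(`%>%`))

# Take the last (most recent) pipe and extract its left-hand side.
last_pipe <- max(which(is_pipe))
lhs <- fake_stack[[last_pipe]][[2]]
print(lhs)
```

Here the most recent pipe is `x %>% testfun`, so the extracted left-hand side is the symbol `x` rather than `myDF`.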
So, you can rewrite your `testfun()` function as:
testfun <- function(objName){
  lhs <- get_lhs()
  if(is.null(lhs)){
    # not inside a pipe, so fall back to the argument's own name
    lhs <- rlang::ensym(objName)
  }
  z <- rlang::as_name(lhs)
  print(z)
}
This means that both of the following return "myDF":
testfun(myDF)
myDF %>% testfun
These will return "x":
f(myDF)
myDF %>% f
And this even works with fseq-style functions in an interesting way:
g <- . %>% testfun
This is a function, which we can use in one of two ways: either as a regular function (e.g. `g(myDF)`) or by piping into it (e.g. `myDF %>% g`). These return two different results:
g(myDF) #returns "."
This is because it's essentially the same as defining `g()` as a function:
g <- function(.){
  . %>% testfun
}
So, this makes sense. BUT when we pipe into it, it gets weird, though the result is still a good one:
myDF %>% g # returns "myDF"
This is because it's essentially chaining the two pipelines together into a single, longer chain (this would be much more apparent if you had many elements in your two pipelines).
Sorry for the long answer, but I thought this was an interesting challenge. I've recently started a blog about my adventures in R and coding, and so I think I'm going to copy this long-winded response into a post on there. So thank you for the inspiration 