Help with writing a custom pipe and environments

I'm currently in environment hell. To make a long story short, I'm trying to write my own pipe. I'm developing a package with some special objects, and I want a pipe operator that lets users compose functions that compute over these objects. (If you're interested, here's a blog post: The {chronicler} package, an implementation of the logger monad in R.)

Now my issue is that the pipe seems to work only as long as I'm working interactively in the global environment. If I call code that uses the pipe from elsewhere, user-defined functions don't get found. Here is a reproducible example. I have one script with the following code:

`%>=%` <- function(.x, .f, ...) {

  # capture the right-hand side call as a string, e.g. "my_mean(na.rm = TRUE)"
  parsed <- parse_function(deparse1(substitute(.f)))

  # build a string like ".x |> my_mean(na.rm = TRUE, )" and evaluate it
  cmd <- make_command(parsed)
  eval(parse(text = cmd))

}

make_command <- function(parsed_function){

  paste0(".x |> ",
         parsed_function$func,
         "(",
         parsed_function$args, ")"
         )

}

parse_function <- function(.f_string){

  func <- gsub("\\(.*$", "", .f_string)
  args <- stringr::str_extract(.f_string, "\\(.*")
  args <- gsub("^\\(", "", args)
  args <- gsub("\\)$", "", args)
  args <- ifelse(args != "", paste0(args, ", "), "")

  list("func" = func,
       "args" = args)

}

This defines my pipe. Here are some tests to see that it works:

a <- c(seq(1:10), NA)

my_log <- log
my_mean <- mean

a %>=%
  my_mean(na.rm = TRUE) %>=%
  my_log()

log(mean(a, na.rm = TRUE))

No problems here (it is important that I define my_log and my_mean, because that's how my package works, by defining new functions that will work with the pipe).
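To make the mechanism concrete, this is roughly what the two helpers produce for the my_mean step above (intermediate values only):

parse_function("my_mean(na.rm = TRUE)")
#> $func
#> [1] "my_mean"
#> 
#> $args
#> [1] "na.rm = TRUE, "

make_command(parse_function("my_mean(na.rm = TRUE)"))
#> [1] ".x |> my_mean(na.rm = TRUE, )"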

Now suppose I have a script with this content:

test_that("test equality", {

  my_sqrt <- sqrt
  my_exp <- exp
  my_mean <- mean
  
  result_pipe <- 1:10 %>=%
    my_sqrt() %>=%
    my_exp() %>=%
    my_mean()

  result <- my_mean(my_exp(my_sqrt(1:10)))

  expect_equal(result_pipe, result)
})

Calling this script with testthat::test_file("test.R") errors with the following message:

══ Testing test.R ═══════════════════════════════════════════════════════════════════════════════════════
[ FAIL 1 | WARN 0 | SKIP 0 | PASS 0 ]

── Error (test.R:7:3): test equality ───────────────────────────────────────────
Error in `my_exp(.x)`: could not find function "my_exp"
Backtrace:
 1. 1:10 %>=% my_sqrt() %>=% my_exp() %>=% my_mean()
      at test.R:7:2
 5. 1:10 %>=% my_sqrt() %>=% my_exp()

[ FAIL 1 | WARN 0 | SKIP 0 | PASS 0 ]

my_exp is not found (my_log and my_mean only get found if you run the main script first, so that they're in the global environment; if you run the test in a completely fresh session, they won't be found either).

Thanks to some helpful people on Twitter, I realised that the problem is due to environments. I tried a few things, googled around, tried using quosures, etc., but nothing seems to work. Any help is much appreciated.

I think the problem is that .x and .f in your pipe function live in different environments. Below is a workaround that you probably won't find helpful: since I'm just reassigning the function to .f, its name gets lost in the process. But it shows that if we bring .f and .x into the same environment, it works. I tried playing around with making f_env a child or parent environment of the execution environment of %>=%, but everything I tried crashed R. And if we only use f_env in eval(), then .x won't be found instead.
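To see the lookup failure in isolation, here is a minimal sketch (pipe_like and my_exp are just illustrative names, and this assumes my_exp is not also defined in the global environment): the constructed string is evaluated in the execution environment of the top-level function, whose parent is the global environment, not the environment where the caller defined my_exp.

pipe_like <- function(.x) {
  # the string is looked up in pipe_like()'s execution environment and its parents
  eval(parse(text = "my_exp(.x)"))
}

local({
  my_exp <- exp   # only visible here, like inside test_that()
  tryCatch(pipe_like(1), error = conditionMessage)
})
#> [1] "could not find function \"my_exp\""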

In the example below I use the magrittr pipe instead of |> because I'm not on R 4.1 yet.

Ideally, we would solve this by rewriting the functions so that they use less string-to-language manipulation and instead work with calls and expressions. I gave it a shot in the last chunk, but I'm not sure how that attempt fits together with your list containing the args and the function name.

library(rlang)
library(testthat)
library(magrittr)

`%>=%` <- function(.x, .f, ...) {
  
  f_quo <- rlang::enquo(.f)             # capture .f together with the environment it comes from
  f_exp <- rlang::quo_get_expr(f_quo)
  f_env <- rlang::quo_get_env(f_quo)
  
  parsed <- parse_function(deparse1(f_exp))
  # fetch the function from the caller's environment and bind it locally as .f
  .f <- get(x = parsed$func, envir = f_env, mode = "function")
  
  cmd <- make_command(parsed)
  eval(parse(text = cmd))
  
}

parse_function <- function(.f_string){
  
  func <- gsub("\\(.*$", "", .f_string)
  args <- stringr::str_extract(.f_string, "\\(.*")
  args <- gsub("^\\(", "", args)
  args <- gsub("\\)$", "", args)
  args <- ifelse(args != "", paste0(args, ", "), "")
  
  list("func" = func,
       "args" = args)
}

make_command <- function(parsed_function){
  
  paste0(".x %>% ",
         ".f",
         "(",
         parsed_function$args, ")"
  )
  
}

a <- c(seq(1:10), NA)
  
my_log <- log
my_mean <- mean
  
a %>=%
  my_mean(na.rm = TRUE) %>=%
  my_log()
#> [1] 1.704748
  
log(mean(a, na.rm = TRUE))
#> [1] 1.704748


test_that("test equality", {
  
  my_sqrt <- sqrt
  my_exp  <- exp
  my_mean <- mean
  
  result_pipe <- 1:10 %>=%
    my_sqrt() %>=%
    my_exp() %>=%
    my_mean()
  
  result <- my_mean(my_exp(my_sqrt(1:10)))
  
  expect_equal(result_pipe, result)
})
#> Test passed 😸

Created on 2022-04-03 by the reprex package (v0.3.0)

This is an attempt to rewrite %>=% without string manipulation; however, I'm not sure what to do with the list containing the function name and its arguments. We can easily construct it, though. Can this function solve your problem?

`%>=%` <- function(.x, .f, ...) {
  
  f_quo <- rlang::enquo(.f)
  f_exp <- rlang::quo_get_expr(f_quo)
  f_env <- rlang::quo_get_env(f_quo)
  f_chr <- deparse(f_exp[[1]])
  
  f <- get(f_chr, envir = f_env)
  q_ex_std <- rlang::call_match(call = f_exp, fn = f)
  expr_ls <- as.list(q_ex_std)
  
  # what should we do with this list ?!
  mylist <- list2("func" = f_chr,
                  "args" = expr_ls[-1])

  eval(call2(f, .x, !!!expr_ls[-1]))

}
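A quick check in the same session (library(rlang) is attached above, so list2() and call2() are available):

my_mean <- mean
c(1:10, NA) %>=% my_mean(na.rm = TRUE)
#> [1] 5.5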

I just wanted to say that I can't reproduce your issue.
I saved the following as 'test.R':


test_that("test equality", {
  
  `%>=%` <- function(.x, .f, ...) {
    
    f_quo <- rlang::enquo(.f)
    f_exp <- rlang::quo_get_expr(f_quo)
    f_env <- rlang::quo_get_env(f_quo)
    
    parsed <- parse_function(deparse1(f_exp))
    .f <- get(x = parsed$func, envir = f_env, mode = "function")
    
    cmd <- make_command(parsed)
    eval(parse(text = cmd))
    
  }
  
  parse_function <- function(.f_string){
    
    func <- gsub("\\(.*$", "", .f_string)
    args <- stringr::str_extract(.f_string, "\\(.*")
    args <- gsub("^\\(", "", args)
    args <- gsub("\\)$", "", args)
    args <- ifelse(args != "", paste0(args, ", "), "")
    
    list("func" = func,
         "args" = args)
  }
  
  make_command <- function(parsed_function){
    
    paste0(".x %>% ",
           ".f",
           "(",
           parsed_function$args, ")"
    )
    
  }
  

  my_sqrt <- sqrt
  my_exp  <- exp
  my_mean <- mean
  
  result_pipe <- 1:10 %>=%
    my_sqrt() %>=%
    my_exp() %>=%
    my_mean()
  
  result <- my_mean(my_exp(my_sqrt(1:10)))
  
  expect_equal(result_pipe, result)
})

and I can restart R, go into a clean session, and run testthat::test_file("test.R"):


== Testing test.R =========================================================================
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 1 ] Done!

@nirgrahamuk: I think you are using my workaround instead of the OP's minimal reproducible example. I also think that my workaround is not a good solution; it rather shows one way the problem can be avoided.

Oh, thanks for pointing that out. My apologies.
However, I did paste the original poster's functions inside the body of the test, and it passed all the same.
Is there something about how the original poster keeps the test light, absent the function definitions it relies on, that makes the difference?
Does the original poster use source() to read in the functions he's writing, so that they would be available for the test in some way?
It's unfortunate that I can't really attempt to assist, because there is no straightforward reproducible example as far as I can tell...

Hi Tim, thanks for your reply! Your second function (without string manipulation) works fine! I had to change it slightly to make it work with my package:

`%>=%` <- function(.c, .f, ...) {

  f_quo <- rlang::enquo(.f)
  f_exp <- rlang::quo_get_expr(f_quo)
  f_env <- rlang::quo_get_env(f_quo)
  f_chr <- deparse(f_exp[[1]])

  f <- get(f_chr, envir = f_env)


  q_ex_std <- rlang::call_match(call = f_exp, fn = f)
  expr_ls <- as.list(q_ex_std)


  eval(call2(f, .c$value, !!!expr_ls[-1], .log_df = .c$log_df))

}

Now I must admit I don't understand every line, especially the one with get(f_chr, envir = f_env) and the one with call_match(). Would you mind giving some details on how this works?

Also, why do you say that your solution is not good? This seems to work perfectly. Do you anticipate certain use cases where it might fail?

Many thanks in any case!

Hi nirgrahamuk, thanks for your reply! That is weird: with my original code, the test should not pass. I don't run source() or anything else; I just have these two scripts, one where the code is defined and one where the test is defined. So when you run the test using testthat::test_file("test.R"), it passes on your machine? Very weird.

OK, reinterpreting what you are saying: the testthat framework by design runs a test in a clean environment, so any prior sourcing and loading of objects you do before calling testthat shouldn't have an impact, and tests that expect to see those objects would be expected to fail...
A test should set up a local environment containing everything it needs for the test to have a chance of passing.
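For example, a minimal self-contained test (a toy example, not the code from this thread) looks like this:

testthat::test_that("objects the test needs are defined inside it", {
  my_mean <- mean   # set up locally rather than relying on the global environment
  testthat::expect_equal(my_mean(c(1:10, NA), na.rm = TRUE), 5.5)
})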

A simpler and more transparent version without rlang:

`%>=%` <- function(.x, .f, ...) {
  # 1. quote function call without evaluating args:
  f_call <- as.call(substitute(.f))
  # (substitutes variables in .f with bindings defined inside this function body,
  #  so no substitutions will actually be done)
  
  # 2. inject .x as first argument into argument list
  f_args <- append(list(.x), as.list(f_call[-1]))
  
  #3. evaluate call to .f in the frame in which %>=% was called
  do.call(
          what  = get(f_call[[1]], mode = "function"), 
          args = f_args,
          envir = parent.frame()
  )
}

Yes, we can rewrite the function in base R, but we need to use a string in get(). So the function would look like this:

`%>=%` <- function(.x, .f, ...) {
  # 1. quote function call without evaluating args:
  f_call <- as.call(substitute(.f))
  # (substitutes variables in .f with bindings defined inside this function body,
  #  so no substitutions will actually be done)
  
  # 2. inject .x as first argument into argument list
  f_args <- append(list(.x), as.list(f_call[-1]))
  
  #3. evaluate call to .f in the frame in which %>=% was called
  do.call(
    what  = get(as.character(f_call)[[1]], mode = "function"), 
    args = f_args,
    envir = parent.frame()
  )
}
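For a quick interactive check in the global environment (this behaves like the original pipe):

my_mean <- mean
c(1:10, NA) %>=% my_mean(na.rm = TRUE)
#> [1] 5.5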

Nevertheless, this doesn't solve the OP's problem. When get() is used without specifying envir, the lookup starts in the environment get() is called from (here the execution environment of %>=%) and continues up its parents. When running {testthat} or other tools with non-standard environment setups (e.g. knitting R Markdown), we run into the same problem the OP mentioned in his post. {rlang}'s quosures are not the only solution, but they make things easy here.

Great to hear that my function works! Sorry for not adding an explanation (see below). I thought the rewrite of the pipe wouldn't match what you were trying to do: I assumed you needed strings for logging, like "x |> function_name()", which you want to write somewhere, and my approach doesn't work with strings. But I'm glad it worked out!

Below is some more context regarding my approach:

I guess the setup is straightforward:
We capture .f as a quosure with rlang::enquo(), then we get the expression with rlang::quo_get_expr() (this is equivalent to substitute()) and the environment of .f with rlang::quo_get_env(). This is really helpful later. Finally, we also need the function name as a string: deparse(f_exp[[1]]).
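For illustration, here is a tiny sketch of that capture step (show_capture is just a throwaway name for this example):

library(rlang)

show_capture <- function(.f) {
  f_quo <- enquo(.f)               # capture the unevaluated call plus its environment
  f_exp <- quo_get_expr(f_quo)     # e.g. my_mean(na.rm = TRUE)
  f_env <- quo_get_env(f_quo)      # the environment the call was written in
  list(expr = f_exp, name = deparse(f_exp[[1]]), env = f_env)
}

my_mean <- mean
show_capture(my_mean(na.rm = TRUE))$name
#> [1] "my_mean"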

The problem with your pipe was that, in non-standard environment setups, .x and .f were not just located in different environments: .f lived in an environment that was not among the parents of the environment where your command string was evaluated. So the lookup went up that search path and couldn't find .f. Since we now capture the environment in which .f can be found, we can get() it and bring it into the execution environment of the pipe function. To be safe, we should also specify the mode argument:

f <- get(f_chr, envir = f_env, mode = "function")

Now we need to get the arguments of the function call in .f.
For this I like rlang::call_match(), which is pretty similar to match.call().

We get the arguments by giving call_match() the expression of the function call, f_exp, and telling it to match the arguments against the function f. For your function this might be a bit of an overkill, since you are not working with the actual argument names. call_match() (and match.call()) are helpful when we want to work with argument names and the user doesn't name each argument. Once we have the arguments, we turn them into a list with as.list().
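Here is a small illustration with a toy function (f is made up just for this example):

library(rlang)

f <- function(x, base = 10, na.rm = FALSE) NULL   # toy function, only for the example
f_exp <- quote(f(5, 2))                           # the user named no arguments

call_match(call = f_exp, fn = f)
#> f(x = 5, base = 2)

as.list(call_match(call = f_exp, fn = f))[-1]     # the matched arguments as a named list
#> $x
#> [1] 5
#> 
#> $base
#> [1] 2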

Now we can use rlang::call2() and splice the list of captured arguments into the new call using the triple-bang operator (!!!).
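And the final step, splicing such a list into a new call (the values are only illustrative):

library(rlang)

args_ls <- list(na.rm = TRUE)               # the captured arguments from the step above
cl <- call2(mean, c(1:10, NA), !!!args_ls)  # build a call to mean() with the spliced arguments
eval(cl)
#> [1] 5.5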

