Return on conditional from the middle of a pipe


#1

I would like to “break” out of a pipe (and the enclosing function) based on a conditional statement. There is a similar question here, although it only evaluates a conditional in the middle of a pipe (i.e. without a return).

library(dplyr)
fnx <- function(data, val) {
  data %>%
    filter(mpg < val) %>%
    {if(nrow(.) == 0) return(NA) else .} %>%
    arrange(mpg)
}

fnx(mtcars, 22)
#works as expected since the conditional evaluates to FALSE

fnx(mtcars, 10) 
#continues to run the pipe despite the fact that the 
#conditional is TRUE resulting in "Error in UseMethod("arrange_") : 
#no applicable method for 'arrange_' applied to an object of class "logical""

Any help would be greatly appreciated!

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.3

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bindrcpp_0.2 dplyr_0.7.4 

loaded via a namespace (and not attached):
 [1] compiler_3.4.3    httr_1.3.1        magrittr_1.5      assertthat_0.2.0 
 [5] R6_2.2.2          rgeolocate_1.0.1  pillar_1.1.0.9000 glue_1.2.0       
 [9] tibble_1.4.2      curl_3.1          Rcpp_0.12.15      jsonlite_1.5     
[13] pkgconfig_2.0.1   rlang_0.2.0       bindr_0.1  

#2

This might just be a case where a single pipe isn’t a great idea, but you could try using do(), maybe like this:

library(dplyr)
fnx <- function(data, val) {
  data %>%
    filter(mpg < val) %>%
    do(
      if(nrow(.) == 0)
        .
      else
        arrange(., mpg)
    )
}

fnx(mtcars, 22)

fnx(mtcars, 10) 

This still doesn’t let you return NA (do() has to return a data frame), but it might be sufficient?


#3

I would argue that this is an unwise idea, especially within the confines of a function. After all, the point of a function is to abstract away the details. Whether the pipe gets broken into two or three pieces is rather immaterial.

Specific concerns I would bring up:

  1. Return statements are very important and should stand out in the code. Burying one inside a pipe obscures it. In fact, it took me several reads of your function before it really stood out to me that you were attempting to terminate the function there.

  2. Your function returns NA in some cases and a data frame in others. If your function is going to be used in a chain, the differing behavior could cause some chaos. It may be preferable to simply return the 0-row data frame instead.
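
A minimal sketch of both suggestions, reusing the fnx example from the question. The names fnx_split and fnx_empty are hypothetical, chosen only for illustration. The first breaks the pipe in two so the early return sits on its own line. The second drops the conditional entirely and lets a 0-row data frame flow through, which should work because arrange() accepts an empty data frame.

```r
library(dplyr)

# Hypothetical variant 1: split the pipe so the early return is explicit.
fnx_split <- function(data, val) {
  filtered <- data %>%
    filter(mpg < val)

  # The return is a top-level statement of the function, easy to spot.
  if (nrow(filtered) == 0) {
    return(NA)
  }

  filtered %>%
    arrange(mpg)
}

# Hypothetical variant 2: always return a data frame, possibly with 0 rows.
fnx_empty <- function(data, val) {
  data %>%
    filter(mpg < val) %>%
    arrange(mpg)
}

fnx_split(mtcars, 10)        # NA, since no car in mtcars has mpg below 10
nrow(fnx_empty(mtcars, 10))  # 0, but the result is still a data frame
```

Which variant fits depends on the callers: the second keeps the return type stable for further chaining, while the first keeps the NA signal the question asked for.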


#4

@ben-e Thanks for your response. Technically, your answer is sufficient. I guess part of my question was “how can I accomplish this” and the other half was “why doesn’t this work” (it was a bit late in the evening when I wrote the question, and I probably should have been clearer about the second half). After a good night’s sleep, I think I will probably break the pipe into several sections, especially since the actual use case doesn’t end with the arrange function but goes on to call several more functions in the pipe, which would look really ugly inside a do statement. Although, I am still a bit perplexed concerning the “why doesn’t this work” part…

@nutterb Point 2 doesn’t apply in the actual use case, but thanks for your input as well.
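
On the “why doesn’t this work” part, here is a sketch of one likely explanation, not a definitive account. As far as I understand magrittr 1.5 (the version in the sessionInfo above), a braced right-hand side such as {if(nrow(.) == 0) return(NA) else .} is wrapped in an anonymous function of the dot, so return(NA) only leaves that wrapper and the pipe carries on with NA as the input to arrange(). Newer magrittr versions may behave differently. The base R code below imitates that wrapping without any pipe; step and outer_fn are made-up names.

```r
# Hypothetical stand-in for the function the pipe builds from the braces.
step <- function(.) {
  if (length(.) == 0) return(NA) else .
}

outer_fn <- function(x) {
  # return(NA) inside step() exits step() only, not outer_fn().
  result <- step(x)
  # Execution continues here, just as the pipe continues on to arrange().
  paste("continued with a value of class", class(result))
}

outer_fn(integer(0))
# [1] "continued with a value of class logical"
```

If that reading is right, it matches the error message in the question: arrange() is handed an object of class "logical" because the NA was returned into the pipe rather than out of fnx.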