R Style Guide -- Explicit return, or not?

AJF · September 17, 2019, 1:09pm

Hey all,

This is a bit of a more general question:

In the Tidyverse R Style Guide, it is suggested that, when writing functions, we only use return() for exiting the function early, but in general we should be relying on R to return the result of the last evaluated expression.

In the Google R Style Guide, it is suggested that returns from functions should always be explicit.

What is the tidyverse argument for not explicitly using return()? I recognize that R is special (unlike, say, Python) and we don't have to explicitly return...but what is the cost? I feel like explicitly returning the result would follow the general tidyverse themes of being as clear as possible.
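For concreteness, here is a minimal sketch of the two conventions side by side. The names `safe_log_tidy` and `safe_log_google` are made up for illustration and do not come from either guide.

```r
# Tidyverse-style: return() only for the early exit; otherwise the
# last evaluated expression supplies the value.
safe_log_tidy <- function(x) {
  if (!is.numeric(x)) {
    return(NA_real_)
  }
  log(x)
}

# Google-style: every exit path uses an explicit return().
safe_log_google <- function(x) {
  if (!is.numeric(x)) {
    return(NA_real_)
  }
  return(log(x))
}
```

Both functions give the same results; the only difference is how the final value is spelled out.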

mishabalyasin · September 17, 2019, 3:26pm

I don't have an answer, but to give you an example where explicit return is cumbersome, I often have functions that have something like this:

some_func <- function(df) {
  df %>%
    fun1() %>%
    fun2() %>%
    fun3()
  ...
}

In this case I would need to create a temp variable and then use return on it, but (personally) it would look more busy than it should/could.
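To make the trade-off concrete, here is a rough sketch of both versions. The bodies of `fun1()`, `fun2()` and `fun3()` are stand-ins invented for the example, and the base pipe `|>` (R 4.1 or later) replaces `%>%` only so the snippet runs without loading magrittr.

```r
# Hypothetical stand-ins for the steps in the chain.
fun1 <- function(df) df[df$x > 0, , drop = FALSE]
fun2 <- function(df) transform(df, y = x * 2)
fun3 <- function(df) df[order(df$y), , drop = FALSE]

# Implicit: the value of the chain is the value of the function.
some_func <- function(df) {
  df |>
    fun1() |>
    fun2() |>
    fun3()
}

# Explicit: a temporary variable exists only so there is
# something to pass to return().
some_func_explicit <- function(df) {
  result <- df |>
    fun1() |>
    fun2() |>
    fun3()
  return(result)
}
```

Both versions produce identical output; the explicit one just carries the extra `result` name.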

AJF · September 17, 2019, 4:25pm

I can see that situation as not needing a return.

I am thinking about a situation where I have to name the object on the last line anyway. For example:

some_func <- function(df, new_names) {
  names(df) <- new_names

  df
}

vs.

some_func <- function(df, new_names) {
  names(df) <- new_names

  return(df)
}

To me, the second function seems much clearer, so I'm surprised that the style guide would tell me not to write it that way, but maybe I am just weird. I do see why preferring return() would be cumbersome when the end of a function is primarily one long chain of %>% calls.

(and yes, I am aware that set_names() already does what my some_func does)

HenrikBengtsson · September 17, 2019, 8:38pm

I'd say one argument for not using return() is that it better conveys the nature of a functional programming language (which R is). If you have a mathematical function f defined as:

f(x) = x^2

you can define this in R as:

f <- function(x) x^2

We don't really think of mathematical functions as returning something, or as performing a sequence of calculations or having side effects. Instead, functions take on a value for a given input. Functional programming languages attempt to "emulate" this mental model as far as possible. Of course, if you ask for the value, a sequence of calculations will take place internally, but the idea is that you should not have to think about that.

BTW, this is probably also why the R help format (Rd) uses \value{...} to describe the value of an R function - it does not use \return{...} (though roxygen2 hides this via its @return ... tag). The term 'return' is more common for procedural programming languages.
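A small sketch of that "value" view in R itself: `if` and `{` are expressions with values too, so nothing has to be "returned". The names `square` and `sign_label` are invented for this example.

```r
#' Square a number
#'
#' @param x A numeric vector.
#' @return The square of `x`. (roxygen2 writes this tag into the
#'   \value{} section of the generated Rd file.)
square <- function(x) x^2

# `if` is an expression, so it has a value as well.
sign_label <- function(x) if (x >= 0) "non-negative" else "negative"

# A braced block takes the value of its last expression.
y <- {
  a <- 3
  square(a) + 1
}
```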

Fer · September 20, 2019, 12:20pm

I have also been wondering about this question for some time. I know there are also personal opinions about what adds clarity (do_something() vs doSomething(), very personal and arguable on each side) and what is a mess (Try.To_Makesomething()). Reading different style guides has taught me a lot about how to write code, but this recommendation against using return has always intrigued me. I do use it a lot, as I often want to control explicitly whether a value is returned visibly when my functions make plots, which as of late is most of them (i.e. return vs invisible).
Each time I look back at some code I wrote barely a few months ago, I scream at how badly I did, either in syntax or in design.
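For what it's worth, a minimal sketch of that return-vs-invisible choice, assuming a made-up function `describe()` whose main job is a side effect (printing here, standing in for a plot):

```r
describe <- function(x) {
  stats <- c(mean = mean(x), sd = sd(x))
  cat("n =", length(x), "\n")  # the side effect
  invisible(stats)             # value is available but not auto-printed
}

res <- describe(c(1, 2, 3))    # only the side-effect line is printed
withVisible(describe(c(1, 2, 3)))$visible  # FALSE
```

Swapping `invisible(stats)` for `return(stats)` or a bare `stats` would make the value print at the console whenever the result is not assigned.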
Cheers

gringer · September 23, 2019, 9:39pm

This should work, with no need for temporary creation:

some_func <- function(df) {
  df %>%
    fun1() %>%
    fun2() %>%
    fun3() %>%
    return()
}

gringer · September 23, 2019, 9:49pm

A purely functional language only has its state stored in the arguments of functions, which in most cases means that the order of execution of a function body is irrelevant - the exceptions generally being where something timey is involved, e.g. input from a mouse click.

R is not such a language; it has a concept of a global state, and an order of operations. R functions can be represented in a functional way, but they don't have to be. It looks like what AJF is arguing for is that where multiple statements exist in the function body, an explicit return is a good idea. This is especially the case where the code needs to go out of its way to [re]state the return value.

If the last evaluated expression is the only evaluated expression, then an explicit return is not needed because there's no ambiguity about what gets sent back.
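One concrete case where the value has to be restated: in R an assignment expression evaluates (invisibly) to its right-hand side, so ending a function on `names(df) <- new_names` hands back the names, not the data frame. A quick sketch, with `rename_bad` and `rename_ok` as invented names:

```r
rename_bad <- function(df, new_names) {
  names(df) <- new_names   # last expression is an assignment
}

rename_ok <- function(df, new_names) {
  names(df) <- new_names
  df                       # or return(df); either way df is named again
}

df <- data.frame(a = 1:2, b = 3:4)
rename_bad(df, c("x", "y"))  # invisibly gives c("x", "y")
rename_ok(df, c("x", "y"))   # gives the renamed data frame
```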
