Best practice on naming inputs for functions

bjoerm · August 1, 2018, 2:00pm

Hey guys,
I would love to hear your opinion on how best to name inputs for functions. Because functions in R can also use objects from the global environment, I explicitly give my "local" variables inside the function distinct names to avoid accidentally using a global object.

A minimal example of how I currently do it:

# Packages and options ----------------------------------------------------

library(dplyr)

speed_filter <- 10

# Some Function -----------------------------------------------------------

some_function <- function(cars_input, speed_filter_input) {  # I often have more than one data frame that enters a function, so just naming the input "data" is not an option. ;-)
  
  speed_filter_input <- speed_filter_input * 2 # Just an operation to show why I want to have an explicit distinction between the local entity of speed_filter(_input) and the global speed_filter variable.

  cars_input <- cars_input %>% 
    dplyr::filter(speed >= speed_filter_input)  

  return(cars_input)  
}

some_function(
  cars_input = cars
  , speed_filter_input = speed_filter
)

Is there any better convention than using the suffix "_input"?

I have seen some functions that use a . (dot) in front of every input. Is that done to distinguish the inputs from objects in the global environment, or is there another reason?

Thanks for your input in advance.

nutterb · August 1, 2018, 2:23pm

I think you're worrying about something that doesn't require any worry.

R only looks outside the function environment for a variable when it finds no argument (or variable created inside the function) with that name. If there is a matching argument name, R uses the argument and never gets as far as the global environment.

The following is a trivial example. It shows that as long as y is an argument of the function, I cannot inadvertently access the y in the global environment (at least not without shenanigans that probably ought to be avoided anyway).

y <- 7

add <- function(x, y){
  x + y
}

add_danger <- function(x){
  x + y
}

add(x = 3, y = y)
#> [1] 10
add(x = 3, y = 7)
#> [1] 10
add(x = 3) # without giving the `y` argument, I can't complete the function
#> Error in add(x = 3): argument "y" is missing, with no default
add_danger(x = 3) # since this has no `y` argument, R will look to the global environment to find it.
#> [1] 10

Created on 2018-08-01 by the reprex package (v0.2.0).
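
The same rule covers the reassignment in the original question. As a minimal sketch (the function name `double_filter` is made up for illustration), an argument may share its name with a global object and be modified inside the function without touching the global one:

```r
speed_filter <- 10

# Hypothetical function: the argument deliberately has the same name
# as the global variable.
double_filter <- function(speed_filter) {
  # This assignment only changes the copy that is local to this call.
  speed_filter <- speed_filter * 2
  speed_filter
}

double_filter(speed_filter)
#> [1] 20
speed_filter # the global value is unchanged
#> [1] 10
```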

bjoerm · August 1, 2018, 2:38pm

Thank you very much for your explanation and the good example, @nutterb. In my most important R scripts I prefer to get an error if I forget to define the function input properly. But that of course depends on how inclined one is to worry.
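
One possible way to catch such cases without renaming anything is a static check, sketched here under the assumption that the codetools package (shipped with standard R installations) is available. `findGlobals()` lists the names a function would have to look up outside its own environment. It is only a rough check: it also reports intentional outside references, and with dplyr code it may report column names such as `speed`.

```r
add <- function(x, y) {
  x + y
}

add_danger <- function(x) {
  x + y
}

# `merge = FALSE` separates the functions used from the variables used.
# Every name used in `add` is an argument, so no outside variable is reported.
codetools::findGlobals(add, merge = FALSE)$variables
#> character(0)

# `add_danger` uses `y`, which is neither an argument nor a local variable.
codetools::findGlobals(add_danger, merge = FALSE)$variables
#> [1] "y"
```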

Maybe someone else shares my worry and also names local variables differently?

Can you also explain why some functions use a dot in front of their function inputs? I have seen this, for example, in purrr, where map has dots in front of the function inputs: map(.x, .f, ...).
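
For context, one commonly cited reason for the dot prefix has to do with argument matching rather than with the global environment: it makes it less likely that a name passed through `...` collides with one of the mapper's own arguments. The sketch below uses two made-up helpers, `apply_each()` and `apply_each_dot()`, built on base `lapply()`; it is not purrr's implementation.

```r
# Hypothetical mapper with plain argument names.
apply_each <- function(x, f, ...) {
  lapply(x, f, ...)
}

# The same mapper with dot-prefixed names, in the style of map(.x, .f, ...).
apply_each_dot <- function(.x, .f, ...) {
  lapply(.x, .f, ...)
}

# A function whose second argument happens to be called `x`.
add_offset <- function(value, x) {
  value + x
}

# `x = 10` is meant for add_offset(), but it matches apply_each()'s own
# `x` argument, so the call fails. The error is captured here so that the
# script keeps running.
plain_result <- tryCatch(
  apply_each(1:3, add_offset, x = 10),
  error = function(e) e
)
inherits(plain_result, "error")
#> [1] TRUE

# With dot-prefixed names, `x = 10` passes through `...` to add_offset().
unlist(apply_each_dot(1:3, add_offset, x = 10))
#> [1] 11 12 13
```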

martin.R · August 1, 2018, 2:40pm

There is a discussion around "to dot or not to dot" here: