Using variable name in a function

Hi
I have written a function that renames the first column and then checks whether the values have length 2 and contain a letter. If so, it adds a "0" at the beginning. I read "Programming with tidyverse" but got lost, so I tried the bang-bang approach. What does not work is using the new name in the ifelse condition. I tried with {{ }}, but that didn't work:

library(tidyverse)
add_zero <- function(data, pre) {
 data %>%
    rename(!!pre := colnames(data)[1]) %>%
    mutate(!!pre := ifelse(
      str_length(G) == 2 & str_detect(G, "[:alpha:]$"),
      paste("0", G, sep = ""),
      G
    ))
}
testdata <- tibble(sec = c("01", "2a"), year = c(2014, 2015))
(test <- add_zero(testdata, "G"))

I would like to have something like this and get rid of the bang bang (if possible) so that I can use another "pre" (e.g. "A" instead of "G"):

library(tidyverse)
add_zero <- function(data, pre) {
 data %>%
    rename(!!pre := colnames(data)[1]) %>%
    mutate(!!pre := ifelse(
      str_length({{ pre }}) == 2 & str_detect({{ pre }}, "[:alpha:]$"),
      paste("0", {{ pre }}, sep = ""),
      {{ pre }}
    ))
}

Cheers
Renger

I get confused with this stuff myself.

I think what's needed is an ensym() call at the start of the function - this enables the function to treat pre as a symbol and help the curly-curly to work throughout. Otherwise I think some of the internal functions treat it as a string. (I'm sort of guessing in the dark here tbh. Occasionally I think I understand, and other times it still doesn't make sense to my brain).

library(dplyr)
library(stringr)

testdata <- tibble(sec = c("01", "2a"), year = c(2014, 2015))

add_zero <- function(data, pre) {
  pre <- ensym(pre)
  data %>%
    rename({{ pre }} := 1) %>% 
    mutate({{ pre }} := if_else(
      str_length({{ pre }}) == 2 & str_detect({{ pre }}, "[:alpha:]"),
      paste0("0", {{ pre }}),
      {{ pre }}
    ))
}

testdata %>% 
  add_zero("G")
#> # A tibble: 2 x 2
#>   G      year
#>   <chr> <dbl>
#> 1 01     2014
#> 2 02a    2015

Created on 2020-09-24 by the reprex package (v0.3.0)

Thanks, that does the trick. Good to know that I am not the only one getting confused. I thought that we could refrain from using ensym now that {{ }} has been introduced, but that is probably wrong.
Cheers
Renger

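You don't need ensym() if you're using {{ }}. I think it's also important to realize that the stringr functions are not data masking, so they just receive whatever {{ }} hands them. When pre is a string, that is the literal "G" rather than the column, which is why using {{ pre }} inside them doesn't do what you want. You should just supply strings and look the column up by name. The easiest way to do this is via the .data pronoun.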

library(dplyr, warn.conflicts = FALSE)
library(stringr)

testdata <- tibble(sec = c("01", "2a"), year = c(2014, 2015))

add_zero <- function(data, pre) {
  data %>%
    rename({{ pre }} := 1) %>%
    mutate({{ pre }} := if_else(
      str_length(.data[[pre]]) == 2 & str_detect(.data[[pre]], "[:alpha:]"),
      str_c("0", .data[[pre]]),
      .data[[pre]]
    ))
}

add_zero(testdata, "G")
#> # A tibble: 2 x 2
#>   G      year
#>   <chr> <dbl>
#> 1 01     2014
#> 2 02a    2015

Created on 2020-09-25 by the reprex package (v0.3.0)
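
To make the difference concrete, here is a minimal sketch (not from the reprex above) that puts the two spellings side by side. The helper name `peek` and the column names `via_embrace` and `via_pronoun` are made up for illustration. With a string argument, `{{ pre }}` just hands the string itself to `str_length()`, while `.data[[pre]]` looks up the column of that name.

```r
library(dplyr, warn.conflicts = FALSE)
library(stringr)

testdata <- tibble(sec = c("01", "2a"), year = c(2014, 2015))

# Hypothetical helper: shows what each spelling passes to str_length()
# when `pre` is a string such as "G".
peek <- function(data, pre) {
  data %>%
    rename({{ pre }} := 1) %>%
    mutate(
      via_embrace = str_length({{ pre }}),    # length of the string held in `pre`
      via_pronoun = str_length(.data[[pre]])  # length of each value in that column
    )
}

res <- peek(testdata, "G")
res
```

Here `via_embrace` is 1 in every row (the length of "G"), while `via_pronoun` reflects the actual column values.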


No problem. The way I remember it is that {{ }} is an operator that performs quoting & unquoting in a single step (which is nice shorthand for common use cases like supplying variables to data masking functions). Functions like ensym(), enquo() etc. only perform quoting. To unquote, you need to use !!. These can be used when you need finer control over quoting and evaluation of expressions.
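
A small sketch of that equivalence, assuming a function that takes a bare column name. The function names `mean_two_step` and `mean_one_step` are invented for this example; both should give the same result.

```r
library(dplyr, warn.conflicts = FALSE)

testdata <- tibble(sec = c("01", "2a"), year = c(2014, 2015))

# Two steps: enquo() quotes the argument, !! unquotes it inside summarise()
mean_two_step <- function(data, col) {
  col <- enquo(col)
  data %>% summarise(avg = mean(!!col))
}

# One step: {{ }} quotes and unquotes in place
mean_one_step <- function(data, col) {
  data %>% summarise(avg = mean({{ col }}))
}

a <- mean_two_step(testdata, year)
b <- mean_one_step(testdata, year)
```

The two-step form is mainly useful when you want to inspect or modify the captured expression before using it.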


Sorry for jumping in after a working solution from @siddharthprabhu.

I also tend to get confused with quoting and unquoting. In the end it is a matter of taste and use case. The question to me is always:

How do I want to call my function?

Renger (@arnyeinstein) requested a call with a string: add_zero(testdata, "G"). However, one idea behind tidyverse-style functions is to pass expressions by just "writing" them, i.e. add_zero(testdata, G). This would be @francisbarton's first solution without ensym.

The question is, will your "G" be an explicit "G" in your source code or will it be some string variable? You could then also call your function like this.

new_column_name <- "G"
# bare-name version: the symbol has to be injected with !!
add_zero_without_ensym(testdata, !!sym(new_column_name))

# or
add_zero_accepting_string(testdata, new_column_name)

On the other hand, some tidyverse functions accept explicit strings when new columns are created, such as gather().
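
For completeness, here is one way the two call styles could look. This is only a sketch: `add_zero_without_ensym` and `add_zero_accepting_string` are the placeholder names used above, filled in from the solutions earlier in the thread. Note that a symbol built with `sym()` has to be injected with `!!` at the call site, otherwise the function receives the unevaluated `sym(...)` call rather than the symbol.

```r
library(dplyr, warn.conflicts = FALSE)
library(stringr)

testdata <- tibble(sec = c("01", "2a"), year = c(2014, 2015))

# Bare-name interface: called as add_zero_without_ensym(testdata, G)
add_zero_without_ensym <- function(data, pre) {
  data %>%
    rename({{ pre }} := 1) %>%
    mutate({{ pre }} := if_else(
      str_length({{ pre }}) == 2 & str_detect({{ pre }}, "[:alpha:]"),
      str_c("0", {{ pre }}),
      {{ pre }}
    ))
}

# String interface: called as add_zero_accepting_string(testdata, "G")
add_zero_accepting_string <- function(data, pre) {
  data %>%
    rename({{ pre }} := 1) %>%
    mutate({{ pre }} := if_else(
      str_length(.data[[pre]]) == 2 & str_detect(.data[[pre]], "[:alpha:]"),
      str_c("0", .data[[pre]]),
      .data[[pre]]
    ))
}

new_column_name <- "G"
by_name   <- add_zero_without_ensym(testdata, G)
by_symbol <- add_zero_without_ensym(testdata, !!sym(new_column_name))
by_string <- add_zero_accepting_string(testdata, new_column_name)
```

All three calls should produce the same tibble; which interface to prefer depends on whether the column name is typed directly in the source or arrives in a variable.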


Thanks @siddharthprabhu that helps make things clearer. I knew there was no need in theory for ensym() with curly-curly but the function didn't work properly for me without it ... because I didn't think about using .data[[]] as you suggest. It'll all seem obvious one day!
