Help creating simple function

Hi all,

I'm trying to write a simple function to remove duplicate survey entries from my data set, but I'm having a little trouble. Here's a toy data set as a reproducible example:

library(tidyverse)  # provides tribble(), mutate(), distinct(), str_to_lower() and %>%

dat=tribble(~name,
            "John",
            "Beverly",
            "Jane",
            "Randy",
            NA,
            "John",
            "john",
            "Bob",
            NA,
            "Carey",
            "Cassidy")

The following pipeline accomplishes exactly what I'm looking for: it lowercases the names, drops the missing ones, and keeps one row per unique name in the data frame.

dat %>%
  mutate(name=str_to_lower(name)) %>%
  na.omit(name) %>% 
  distinct(name, .keep_all=TRUE)

When I tried to turn the above code into a function, however, I got an error that reads: "Problem with mutate() input x. object 'name' not found. Input x is str_to_lower(x)."

My function is below. Does anyone have any suggestions or know what's going wrong?

drop_dupes=function(df, x){
  df=df %>%
    mutate({{x}}=str_to_lower({{x}})) %>%
    na.omit({{x}}) %>% 
    distinct({{x}}, .keep_all=TRUE)
  
  return(df)
}

dat %>% drop_dupes(name)

It might be due to a difference in R or dplyr versions, but I got a different set of errors.

Error: unexpected '=' in:
"  df=df %>%
    mutate({{x}}="
>     na.omit({{x}}) %>% 
+     distinct({{x}}, .keep_all=TRUE)
Error in na.omit({ : object 'x' not found
>   
>   return(df)
Error: no function to return from, jumping to top level
> }
Error: unexpected '}' in "}"

The first of these arises because you can't assign to a dynamic name inside mutate() with the ordinary = operator; the colon-equals operator := is needed instead:

drop_dupes=function(df, x){
  df=df %>%
    mutate({{x}}:=str_to_lower({{x}})) %>%
    na.omit({{x}}) %>% 
    distinct({{x}}, .keep_all=TRUE)
  
  return(df)
}
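
For reference, here is a self-contained sketch of the corrected function run on the toy data from the question. One assumption of mine, not part of the fix above: I swapped na.omit() for filter(!is.na({{ x }})), so only missing values in the chosen column are dropped. na.omit() drops rows with an NA in any column, which gives the same result on a one-column data frame but may not on a wider one.

```r
library(dplyr)
library(stringr)
library(tibble)

# Toy data from the question
dat <- tribble(~name,
               "John", "Beverly", "Jane", "Randy", NA, "John",
               "john", "Bob", NA, "Carey", "Cassidy")

drop_dupes <- function(df, x) {
  df %>%
    # := allows the embraced argument on the left-hand side
    mutate({{ x }} := str_to_lower({{ x }})) %>%
    # assumption: drop NAs in this column only, instead of na.omit()
    filter(!is.na({{ x }})) %>%
    # keep the first row for each distinct value of the column
    distinct({{ x }}, .keep_all = TRUE)
}

result <- dat %>% drop_dupes(name)
result$name
```

On this data the three spellings of "John" collapse into a single "john" row and both NA rows are removed, leaving seven rows.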

Thanks, your fix works great! Is there an article somewhere on the colon-equals operator? I've never heard of it before, and I'd like to know what it does and why it works here while my way did not.

It's part of rlang. You can bring up the relevant rlang documentation in the RStudio help pane like this:

 ?`:=`

Forcing names
When a function takes multiple named arguments (e.g. dplyr::mutate()), it is difficult to supply a variable as name. Since the LHS of = is defused, giving the name of a variable results in the argument having the name of the variable rather than the name stored in that variable. This problem of forcing evaluation of names is exactly what the !! operator is for.

Unfortunately R is very strict about the kind of expressions supported on the LHS of =. This is why rlang interprets the walrus operator := as an alias of =. You can use it to supply names, e.g. a := b is equivalent to a = b. Since its syntax is more flexible you can also force names on its LHS:

name <- "Jane"

list2(!!name := 1 + 2)
exprs(!!name := 1 + 2)

Like =, the := operator expects strings or symbols on its LHS.

Since unquoting names is related to interpolating within a string with the glue package, we have made the glue syntax available on the LHS of :=:

list2("{name}" := 1)
tibble("{name}" := 1)

You can also interpolate defused function arguments with double braces {{, similar to the curly-curly syntax:

wrapper <- function(data, var) {
  data %>% mutate("{{ var }}_foo" := {{ var }} * 2)
}

Currently, forcing names with := only works in top level expressions. These are all valid:

exprs("{name}" := x)
tibble("{name}" := x)

But deep-forcing names isn't supported:

exprs(this(is(deep("{name}" := x))))
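
To make the naming rules from the help page concrete, here is a small sketch. The function add_doubled(), the scores tibble and the col variable are hypothetical names I made up for illustration; it assumes versions of dplyr, tibble and rlang recent enough to support the glue syntax on the left of :=.

```r
library(dplyr)
library(tibble)

# Hypothetical wrapper: builds a new column name from the embraced argument
add_doubled <- function(data, var) {
  data %>% mutate("{{ var }}_doubled" := {{ var }} * 2)
}

scores <- tibble(score = c(1, 2, 3))
doubled <- add_doubled(scores, score)
names(doubled)

# A column name stored as a string in a variable uses single braces
col <- "total"
named <- tibble("{col}" := 1)
names(named)
```

The first call should produce a column called score_doubled next to score, and the second a single column called total. Double braces are for function arguments that hold a column reference; single braces are for ordinary variables that hold a string.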
