Trouble creating column names from an argument passed to a function

Hi, I'm stumped, even after reading the tidyverse programming vignette. I'm trying to set the name of a column in a data frame that is created within a function. The name comes from one of the function's arguments, which happens to be numeric. Thanks for any insights.

df <- data_frame(a = seq(1,10,by=1))
calculate_exp <- function(df, value) {
  df %>%
    mutate(as.name(value) = a^value)
}
calculate_exp(df, 2)

And I get the following error:

#> Error: :4:27: unexpected '='
#> 3: df %>%
#> 4: mutate(as.name(value) =
#> ^
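
A note on why this fails: the error comes from R's parser, before mutate() is ever called. In a function call, whatever sits to the left of = has to be a plain name or a string, so as.name(value) = cannot even be parsed. A minimal base R sketch to check this (the strings passed to parse() are only illustrations, and nothing here is evaluated):

```r
# Parsing alone fails, so no dplyr code is involved at this point.
bad <- try(parse(text = "mutate(df, as.name(value) = a^value)"), silent = TRUE)
inherits(bad, "try-error")

# To the parser, `:=` is an ordinary binary operator, so an expression
# on its left-hand side is syntactically valid.
ok <- parse(text = "mutate(df, !!as.name(value) := a^value)")
is.expression(ok)
```

The second call only shows that the code parses; whether it then does what you want is up to dplyr and rlang at run time.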

I am not totally sure I understand what you are trying to do. But if you want to write a function that creates a new column called a^value, which raises a to the power of value, you could do this:

library(tidyverse)

df <- data_frame(a = 1:10)

calculate_exp <- function(df, value) {
  df %>%
    mutate(a^value)
}

calculate_exp(df, 2)
#> # A tibble: 10 x 2
#>        a `a^value`
#>    <int>     <dbl>
#>  1     1         1
#>  2     2         4
#>  3     3         9
#>  4     4        16
#>  5     5        25
#>  6     6        36
#>  7     7        49
#>  8     8        64
#>  9     9        81
#> 10    10       100

I actually just spent the better part of the morning figuring out how to do something quite like this. The tool for the job you're trying to do is "quasiquotation", and there are lots of resources about it. I'm not a quasiquotation or tidyeval expert, so there might be a better way to do this, but here's my solution:

library(dplyr)
library(rlang)
df <- data_frame(a = 1:10)

calculate_exp <- function(df, value) {
  colName = paste0("a^", quo_name(value))
  df %>%
    mutate(!!quo_name(colName) := a^value)
}

calculate_exp(df, 3)
# A tibble: 10 x 2
#       a `a^3`
#  <int> <dbl>
# 1     1     1
# 2     2     8
# 3     3    27
# 4     4    64
# 5     5   125
# 6     6   216
# 7     7   343
# 8     8   512
# 9     9   729
# 10    10  1000

Oh, now I see what the problem was lol. I thought I was not understanding the question because the answer was too easy :smile:


To explain a little more:

  • quo_name takes whatever value holds (here, a number) and returns it as a string
  • := is necessary if you're going to have an unquoted expression on the left-hand side of an assignment like that, because R's parser only accepts a plain name or string before =
  • !! "unquotes" what follows it. In particular, it tells mutate not to quote colName; without it, the new column would literally be named "colName"
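
The points above can be checked with a small sketch. The names col_name and "a_squared" are made up for illustration, and this assumes a dplyr version that supports !! and := inside mutate():

```r
library(dplyr)

df <- tibble(a = 1:3)
col_name <- "a_squared"  # hypothetical column name stored in a variable

# Without !!, mutate() takes the left-hand side literally.
literal <- df %>% mutate(col_name = a^2)
names(literal)

# With !! and :=, the string stored in col_name becomes the column name.
injected <- df %>% mutate(!!col_name := a^2)
names(injected)
```

The first result has a column called col_name, the second a column called a_squared.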

I think your first quo_name is unnecessary:

library(tidyverse)

df <- data_frame(a = 1:10)

calculate_exp <- function(df, value) {
  colName = paste0("a^", value)
  df %>%
    mutate(!!quo_name(colName) := a^value)
}

calculate_exp(df, 3)
#> # A tibble: 10 x 2
#>        a `a^3`
#>    <int> <dbl>
#>  1     1     1
#>  2     2     8
#>  3     3    27
#>  4     4    64
#>  5     5   125
#>  6     6   216
#>  7     7   343
#>  8     8   512
#>  9     9   729
#> 10    10  1000

Doh, not sure how I missed that.

I didn't even understand the point of the question at first, so don't worry :smile: Your explanation and solution are great.


Now, I am not fully sure what @bing wants as the name of the newly created variable. Their attempt makes me wonder if maybe they only wanted this:

library(tidyverse)

df <- data_frame(a = 1:10)

calculate_exp <- function(df, value) {
  df %>%
    mutate(!!quo_name(value) := a^value)
}

calculate_exp(df, 2)
#> # A tibble: 10 x 2
#>        a   `2`
#>    <int> <dbl>
#>  1     1     1
#>  2     2     4
#>  3     3     9
#>  4     4    16
#>  5     5    25
#>  6     6    36
#>  7     7    49
#>  8     8    64
#>  9     9    81
#> 10    10   100

Maybe! Though hopefully now they recognize how to turn strings from function arguments into column names with dplyr as a general pattern.


Actually, using rlang is not necessary:

library(tidyverse)

df <- data_frame(a = 1:10)

calculate_exp <- function(df, value) {
  df %>%
    mutate(a^value) %>%
    setNames(c("a", value))
}

calculate_exp(df, 2)
#> # A tibble: 10 x 2
#>        a   `2`
#>    <int> <dbl>
#>  1     1     1
#>  2     2     4
#>  3     3     9
#>  4     4    16
#>  5     5    25
#>  6     6    36
#>  7     7    49
#>  8     8    64
#>  9     9    81
#> 10    10   100

And the solution with the nicer a^value as the column name, instead of just value:

library(tidyverse)

df <- data_frame(a = 1:10)

calculate_exp <- function(df, value) {
  df %>%
    mutate(a^value) %>%
    setNames(c("a", paste0("a^",value)))
}

calculate_exp(df, 2)
#> # A tibble: 10 x 2
#>        a `a^2`
#>    <int> <dbl>
#>  1     1     1
#>  2     2     4
#>  3     3     9
#>  4     4    16
#>  5     5    25
#>  6     6    36
#>  7     7    49
#>  8     8    64
#>  9     9    81
#> 10    10   100

But all our solutions involve some hard coding (the worst in this respect being my last ones).

To make the function general and get rid of this hard coding, this is nicer:

library(tidyverse)

calculate_exp <- function(dat, var, value) {
  dat %>%
    mutate(!!quo_name(value) := var^value)
}

df <- data_frame(a = 1:10)

calculate_exp(df, df$a, 2)
#> # A tibble: 10 x 2
#>        a   `2`
#>    <int> <dbl>
#>  1     1     1
#>  2     2     4
#>  3     3     9
#>  4     4    16
#>  5     5    25
#>  6     6    36
#>  7     7    49
#>  8     8    64
#>  9     9    81
#> 10    10   100

Now the function works with any data frame, any variable of that data frame, and any value.
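
As a quick check of that claim, here is a sketch with a different data frame. The tibble cars_small and its column speed are made up for illustration, and as.character() stands in for quo_name() on the assumption that value is a plain number:

```r
library(dplyr)

calculate_exp <- function(dat, var, value) {
  dat %>%
    # as.character() turns the numeric argument into the column name
    mutate(!!as.character(value) := var^value)
}

cars_small <- tibble(speed = c(4, 7, 8))  # hypothetical data
out <- calculate_exp(cars_small, cars_small$speed, 3)
names(out)
```

Note that var arrives as a plain vector here, so the function cannot recover the name speed for use in the new column name.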


Just playing around here to try to understand tidyeval a bit better, which led me to this modification. It lets you pass an unquoted variable name to the function (as most tidyverse functions allow) and names the new column after the exponential expression used to create it.

library(tidyverse)

calculate_exp <- function(dat, var, value) {
  var <- enquo(var)
  exp_name <- paste0(quo_name(var), "^", quo_name(value))

  dat %>%
    mutate(!!exp_name := `^`(!!var, value))
}

df <- data_frame(a = 1:10)

calculate_exp(df, a, 2)
#> # A tibble: 10 x 2
#>        a `a^2`
#>    <int> <dbl>
#>  1     1    1.
#>  2     2    4.
#>  3     3    9.
#>  4     4   16.
#>  5     5   25.
#>  6     6   36.
#>  7     7   49.
#>  8     8   64.
#>  9     9   81.
#> 10    10  100.

Created on 2018-04-28 by the reprex package (v0.2.0).
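
For anyone reading this with newer package versions: the enquo() / !! pair can probably be replaced by the embrace operator {{ }} together with a glue-style name on the left of :=. This is only a sketch and assumes dplyr >= 1.0.0 and rlang >= 0.4.3, which are newer than what was used in this thread:

```r
library(dplyr)

calculate_exp <- function(dat, var, value) {
  dat %>%
    # "{{ var }}" inserts the name of the column that was passed in,
    # "{value}" inserts the value of the numeric argument
    mutate("{{ var }}^{value}" := {{ var }}^value)
}

out <- calculate_exp(tibble(a = 1:3), a, 2)
names(out)
```

The call is the same as before (an unquoted column name plus a number), and the new column should again be named a^2.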


Thank you everybody for the constructive discussion, which helped me better understand NSE and quasiquotation. Although rlang is the least intuitive approach for beginners and requires a great deal of study of the vignette, I believe that using quosures is the best approach: it gives the most general solution and is the safest inside a function. setNames is nice but cumbersome outside of trivial examples. And seeing more examples of paste/paste0 is always helpful.
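
For comparison, here is a base R sketch that names the new column by string without renaming columns by position, which is where setNames gets awkward. The function calculate_exp_base and its var_name argument are hypothetical, and it takes the variable name as a string, so the interface differs from the tidyeval versions in this thread:

```r
calculate_exp_base <- function(dat, var_name, value) {
  # Build the new column name, e.g. "a^2"
  new_name <- paste0(var_name, "^", value)
  # [[<- adds the column under that exact name, wherever it ends up
  dat[[new_name]] <- dat[[var_name]]^value
  dat
}

out <- calculate_exp_base(data.frame(a = 1:3, b = 4:6), "a", 2)
names(out)
```

Existing columns keep their names no matter how many there are, which the setNames(c("a", ...)) approach does not handle.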

The generalization of the function to any data frame is particularly clever, and it will matter for any nontrivial data frame.

Brilliant :slight_smile:

This has been a really fun cooperative effort with everyone making the answer a little better :slight_smile:

My turn to try variations to learn and try to make the expression slightly shorter:

library(tidyverse)

calculate_exp <- function(dat, var, value) {
  var <- enquo(var)
  dat %>%
    mutate(!!paste(quo_name(var), quo_name(value), sep = "^") := (!!var)^value)
}

df <- data_frame(a = 1:10)

calculate_exp(df, a, 2)
#> # A tibble: 10 x 2
#>        a `a^2`
#>    <int> <dbl>
#>  1     1     1
#>  2     2     4
#>  3     3     9
#>  4     4    16
#>  5     5    25
#>  6     6    36
#>  7     7    49
#>  8     8    64
#>  9     9    81
#> 10    10   100

It is interesting to see that quo_name(enquo(var)) will not work.