Trouble with creating column names from a passed argument in function

bing · April 27, 2018, 5:28pm

Hi, I'm stumped, even after reading the tidyverse programming vignette. I'm trying to set a column name in a data frame that is created within a function. The name should come from one of the function's arguments, which is numeric. Thanks for any insights.

df <- data_frame(a = seq(1,10,by=1))
calculate_exp <- function(df, value) {
  df %>%
    mutate(as.name(value) = a^value)
}
calculate_exp(df, 2)

And I get the following error:

#> Error: :4:27: unexpected '='
#> 3: df %>%
#> 4: mutate(as.name(value) =
#> ^

prosoitos · April 27, 2018, 11:03pm

I am not totally sure I understand what you are trying to do. But if you want to write a function that creates a new column called a^value, which raises a to the power of value, you could do this:

library(tidyverse)

df <- data_frame(a = 1:10)

calculate_exp <- function(df, value) {
  df %>%
    mutate(a^value)
}

calculate_exp(df, 2)
#> # A tibble: 10 x 2
#>        a `a^value`
#>    <int>     <dbl>
#>  1     1         1
#>  2     2         4
#>  3     3         9
#>  4     4        16
#>  5     5        25
#>  6     6        36
#>  7     7        49
#>  8     8        64
#>  9     9        81
#> 10    10       100

dstander · April 27, 2018, 11:29pm

I actually just spent the better part of the morning figuring out how to do something quite like this. The tool for the job you're trying to do is "quasiquotation", and there are lots of resources about it. I'm not a quasiquotation or tidyeval expert, so there might be a better way to do this, but here's my solution:

library(dplyr)
library(rlang)
df <- data_frame(a = 1:10)

calculate_exp <- function(df, value) {
  colName = paste0("a^", quo_name(value))
  df %>%
    mutate(!!quo_name(colName) := a^value)
}

calculate_exp(df, 3)
# A tibble: 10 x 2
#       a `a^3`
#  <int> <dbl>
# 1     1     1
# 2     2     8
# 3     3    27
# 4     4    64
# 5     5   125
# 6     6   216
# 7     7   343
# 8     8   512
# 9     9   729
# 10    10  1000

prosoitos · April 27, 2018, 11:30pm

Oh, now I see what the problem was, lol. I thought I was not understanding the question because the answer was too easy.

dstander · April 27, 2018, 11:33pm

To explain a little more:

quo_name gets at the value that value represents and then returns it as a string.
:= is necessary if you're going to have an unquoted expression on the left-hand side of an assignment like that, because R won't parse a plain = there.
!! "unquotes" colName. In particular, it tells mutate not to quote colName; without it, the name of the new column would literally be "colName".
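
Here is a minimal sketch of that last point. The three-row data frame is only for illustration, and I'm assuming a current dplyr, where tibble() stands in for data_frame():

```r
library(dplyr)

df <- tibble(a = 1:3)
colName <- "a^2"

# Without !!, mutate() quotes the left-hand side and uses it as the name,
# so the new column is called "colName"
without_bang <- mutate(df, colName = a^2)
names(without_bang)

# With !! and :=, the string stored in colName becomes the column name,
# so the new column is called "a^2"
with_bang <- mutate(df, !!colName := a^2)
names(with_bang)
```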

prosoitos · April 27, 2018, 11:55pm

I think your first quo_name is unnecessary, since paste0 already converts value to a string:

library(tidyverse)

df <- data_frame(a = 1:10)

calculate_exp <- function(df, value) {
  colName = paste0("a^", value)
  df %>%
    mutate(!!quo_name(colName) := a^value)
}

calculate_exp(df, 3)
#> # A tibble: 10 x 2
#>        a `a^3`
#>    <int> <dbl>
#>  1     1     1
#>  2     2     8
#>  3     3    27
#>  4     4    64
#>  5     5   125
#>  6     6   216
#>  7     7   343
#>  8     8   512
#>  9     9   729
#> 10    10  1000

dstander · April 27, 2018, 11:57pm

Doh, not sure how I missed that.

prosoitos · April 27, 2018, 11:58pm

I didn't even understand the point of the question at first, so don't worry. Your explanation and solution are great.

prosoitos · April 28, 2018, 12:04am

Now, I am not fully sure what @bing wants as the name of the newly created variable. Their attempt makes me wonder if maybe they only wanted this????

library(tidyverse)

df <- data_frame(a = 1:10)

calculate_exp <- function(df, value) {
  df %>%
    mutate(!!quo_name(value) := a^value)
}

calculate_exp(df, 2)
#> # A tibble: 10 x 2
#>        a   `2`
#>    <int> <dbl>
#>  1     1     1
#>  2     2     4
#>  3     3     9
#>  4     4    16
#>  5     5    25
#>  6     6    36
#>  7     7    49
#>  8     8    64
#>  9     9    81
#> 10    10   100

dstander · April 28, 2018, 12:08am

Maybe! Though hopefully now they recognize how to turn strings from function arguments into column names with dplyr as a general pattern.

prosoitos · April 28, 2018, 2:05am

Actually, using rlang is not necessary:

library(tidyverse)

df <- data_frame(a = 1:10)

calculate_exp <- function(df, value) {
  df %>%
    mutate(a^value) %>%
    setNames(c("a", value))
}

calculate_exp(df, 2)
#> # A tibble: 10 x 2
#>        a   `2`
#>    <int> <dbl>
#>  1     1     1
#>  2     2     4
#>  3     3     9
#>  4     4    16
#>  5     5    25
#>  6     6    36
#>  7     7    49
#>  8     8    64
#>  9     9    81
#> 10    10   100

prosoitos · April 28, 2018, 2:07am

And the same solution with the nicer a^value column name instead of just the value:

library(tidyverse)

df <- data_frame(a = 1:10)

calculate_exp <- function(df, value) {
  df %>%
    mutate(a^value) %>%
    setNames(c("a", paste0("a^",value)))
}

calculate_exp(df, 2)
#> # A tibble: 10 x 2
#>        a `a^2`
#>    <int> <dbl>
#>  1     1     1
#>  2     2     4
#>  3     3     9
#>  4     4    16
#>  5     5    25
#>  6     6    36
#>  7     7    49
#>  8     8    64
#>  9     9    81
#> 10    10   100

prosoitos · April 28, 2018, 2:38am

But all our solutions involve some hard coding (the worst in this respect being my last ones).

To make the function general and get rid of this hard coding, this is nicer:

library(tidyverse)

calculate_exp <- function(dat, var, value) {
  dat %>%
    mutate(!!quo_name(value) := var^value)
}

df <- data_frame(a = 1:10)

calculate_exp(df, df$a, 2)
#> # A tibble: 10 x 2
#>        a   `2`
#>    <int> <dbl>
#>  1     1     1
#>  2     2     4
#>  3     3     9
#>  4     4    16
#>  5     5    25
#>  6     6    36
#>  7     7    49
#>  8     8    64
#>  9     9    81
#> 10    10   100

Now the function works with any data frame, any variable of that data frame, and any value.
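
As a quick check on a different data frame, here is a sketch using the built-in mtcars. I've swapped quo_name() for base R's as.character() so the snippet only needs dplyr; that swap is my assumption, not something from the posts above. Note that var is passed as a plain vector (mtcars$cyl), so it has to come from the same data frame, in the same row order, as dat.

```r
library(dplyr)

calculate_exp <- function(dat, var, value) {
  # as.character() turns the number into a string usable as a column name
  dat %>%
    mutate(!!as.character(value) := var^value)
}

res <- calculate_exp(mtcars, mtcars$cyl, 3)

# The new column is appended last and is named "3"
tail(names(res), 1)

# First row of mtcars has cyl = 6, so the new value is 6^3 = 216
res[["3"]][1]
```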

mfherman · April 28, 2018, 11:50am

Just playing around here to try to understand tidyeval a bit better, and it led me to this modification. It lets you pass an unquoted variable name to the function (as most tidyverse functions allow) and names the new column after the exponential expression used to create it.

library(tidyverse)

calculate_exp <- function(dat, var, value) {
  var <- enquo(var)
  exp_name <- paste0(quo_name(var), "^", quo_name(value))

  dat %>%
    mutate(!!exp_name := `^`(!!var, value))
}

df <- data_frame(a = 1:10)

calculate_exp(df, a, 2)
#> # A tibble: 10 x 2
#>        a `a^2`
#>    <int> <dbl>
#>  1     1    1.
#>  2     2    4.
#>  3     3    9.
#>  4     4   16.
#>  5     5   25.
#>  6     6   36.
#>  7     7   49.
#>  8     8   64.
#>  9     9   81.
#> 10    10  100.

Created on 2018-04-28 by the reprex package (v0.2.0).
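
For anyone reading this later: as far as I know, newer versions of rlang (0.4.3 and up, if I have the version right) allow the same thing to be written with the {{ }} operator and glue-style names on the left of :=. This is a sketch under that assumption, not something that was available in the reprex above:

```r
library(dplyr)

calculate_exp <- function(dat, var, value) {
  # {{ var }} forwards the unquoted column name passed by the caller;
  # "{value}" inside the name string interpolates the numeric argument
  mutate(dat, "{{ var }}^{value}" := {{ var }}^value)
}

res <- calculate_exp(tibble(a = 1:3), a, 2)

# The new column is named after the expression: "a^2"
names(res)
```

The function body no longer needs enquo(), quo_name(), or paste0(), at the cost of requiring a more recent rlang.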

bing · April 28, 2018, 2:58pm

Thank you everybody for the constructive discussion; it helped me better understand NSE and quasiquotation. Although it is the least intuitive approach for beginners and requires a great deal of study of the vignette, I believe that rlang with quosures is the best approach: it gives the most general solution and is the safest inside a function. setNames is nice but cumbersome outside of trivial examples. And seeing more examples of paste/paste0 is always helpful.

The generalization of the function to any data frame is particularly clever for any nontrivial data frame.

prosoitos · April 28, 2018, 5:35pm

Brilliant

This has been a really fun cooperative effort, with everyone making the answer a little better.

prosoitos · April 28, 2018, 5:56pm

My turn to try variations to learn and try to make the expression slightly shorter:

library(tidyverse)

calculate_exp <- function(dat, var, value) {
  var <- enquo(var)
  dat %>%
    mutate(!!paste(quo_name(var), quo_name(value), sep = "^") := (!!var)^value)
}

df <- data_frame(a = 1:10)

calculate_exp(df, a, 2)
#> # A tibble: 10 x 2
#>        a `a^2`
#>    <int> <dbl>
#>  1     1     1
#>  2     2     4
#>  3     3     9
#>  4     4    16
#>  5     5    25
#>  6     6    36
#>  7     7    49
#>  8     8    64
#>  9     9    81
#> 10    10   100

It is interesting to see that quo_name(enquo(var)) will not work.