 df |> mutate( {{x}} := as.factor( {{x}} )) not working as intended

The original column is numeric. Instead of being converted to a factor, every value in the column becomes the column name. Why does the LHS recognise it correctly, but not the RHS?

df |>
mutate( {{x}} := as.factor( {{x}} ))

I think we need a bit more info than you've given. Is this in a function? This works:

library(tidyverse)

myfunc = function(df, x){

df |>
mutate( {{x}} := as.factor( {{x}} ))

}

myfunc(tibble(mtcars), carb)
# A tibble: 32 x 11
mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear carb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct>
1  21       6  160    110  3.9   2.62  16.5     0     1     4 4
2  21       6  160    110  3.9   2.88  17.0     0     1     4 4
3  22.8     4  108     93  3.85  2.32  18.6     1     1     4 1
4  21.4     6  258    110  3.08  3.22  19.4     1     0     3 1
5  18.7     8  360    175  3.15  3.44  17.0     0     0     3 2
6  18.1     6  225    105  2.76  3.46  20.2     1     0     3 1
7  14.3     8  360    245  3.21  3.57  15.8     0     0     3 4
8  24.4     4  147.    62  3.69  3.19  20       1     0     4 2
9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4 2
10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4 4
# ... with 22 more rows

Hmm I see that.

library(tidyverse)

ddff = 'tibble(mtcars)'
xx = 'carb'

myfunc = function(df, x){

{{ df }} |>
mutate( {{x}} := as.factor( {{x}} ))

}

myfunc(ddff, xx)

Something like this is what I would like.

Ah I see! This is challenging, but there's a pretty good article here:

This will run.

library(tidyverse)

ddff = 'tibble(mtcars)'
xx = 'carb'

myfunc = function(df, x){
x = sym(x)
eval(parse(text=df)) |>
mutate({{x}} := as.factor({{x}}))
}

myfunc(ddff, xx)

The eval(parse(text = df)) line evaluates whatever string you provide as "df" as R code.

The sym(x) line turns the "x" argument, again provided as a string, into a symbol, which the curly-curly ({{ }}) operator can then inject as a column name inside mutate().
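This also explains the original symptom: when x holds a string, {{x}} on the right-hand side injects the string itself, so as.factor() receives the constant 'carb' and recycles it down the column, while := on the left-hand side is happy to take a string as a name. As an alternative to sym(), dplyr's .data pronoun accepts a column name as a string directly. A minimal sketch, assuming a reasonably recent dplyr (1.0 or later) and R 4.1 or later for the native pipe; to_factor, broken and fixed are hypothetical names, not from the thread:

```r
library(dplyr)

# What the failing call does in effect when x holds the string "carb":
# the right-hand side is the constant "carb", recycled to every row.
broken <- mtcars |> mutate(carb = as.factor("carb"))
levels(broken$carb)

# Hypothetical helper (not from the thread): `col` is a column name
# supplied as a string.
to_factor <- function(df, col) {
  df |>
    # "{col}" names the output column; .data[[col]] reads the input column.
    mutate("{col}" := as.factor(.data[[col]]))
}

fixed <- to_factor(mtcars, "carb")
levels(fixed$carb)
```

The .data[[col]] form keeps the column lookup explicit, so no conversion to a symbol is needed.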


I have additional details. It seems that similar methods won't work for recipes. I tried sym(x), x, {{ x }}, and !!sym(x). Neither the link you provided nor what I found after searching online seems to cover using recipes inside functions explicitly.

df = data.frame(text = c('blah blah blah', 'hello, hi, welcome', 'what, why'),
xx = as.factor(c('1', '0', '1')))

myfunc = function(dataa, x){
recc = recipe(as.formula(paste0(x, ' ~ text')), data = dataa) |>
step_tokenize(!!sym(x)) |> # tried some variations
prep() |>
bake(NULL) |>
View()
}

ddff = 'df'
label = 'xx'

myfunc(ddff, label)

#Can't convert <textrecipes_tokenlist> to <character>.
#Run `rlang::last_error()` to see where the error occurred.

I think your issue is that you haven't evaluated your data frame. This runs:

library(tidymodels)
library(textrecipes)

df = data.frame(text = c('blah blah blah', 'hello, hi, welcome', 'what, why'),
xx = as.factor(c('1', '0', '1')))

myfunc = function(dataa, x){
recc = recipe(as.formula(paste0(x, ' ~ text')), data = eval(parse(text=dataa))) |>
step_tokenize(!!sym(x)) |>
prep() |>
bake(NULL)

return(recc)
}

ddff = 'df'
label = 'xx'

myfunc(ddff, label)

Though this comes with the caveat that I don't really know what step_tokenize is meant to do! Should it be label = "text" rather than "xx"?
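When the string is only the name of an existing object (as with ddff = 'df' here), base R's get() is a narrower alternative to eval(parse()): it looks the name up and cannot run arbitrary code. It would not handle a string such as 'tibble(mtcars)', which is an expression rather than a name. A minimal sketch; lookup_df and my_data are hypothetical names, not from the thread, and mode = "list" is an assumption made so that a data frame is found rather than a function of the same name, such as stats::df:

```r
# Hypothetical helper: fetch a data frame by the name of the object holding it.
# env defaults to the caller's environment; get() also searches its parents.
lookup_df <- function(name, env = parent.frame()) {
  # mode = "list" skips non-list objects (for example functions) with this name.
  get(name, envir = env, mode = "list")
}

my_data <- data.frame(text = c("blah blah blah", "what, why"))
result <- lookup_df("my_data")
is.data.frame(result)
```

Inside myfunc, data = lookup_df(dataa) could then stand in for data = eval(parse(text = dataa)).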


Oh oops, I think the evaluation of prep() might have been skipped there. This might be the better reprex.

library(tidymodels)
library(textrecipes)
rm(list = ls())

df = data.frame(text = c('blah blah blah', 'hello, hi, welcome', 'what, why'),
xx = as.factor(c('1', '0', '1')))

myfunc = function(dataa, x){
recc = recipe(as.formula(paste0(x, ' ~ text')), data = eval(parse(text=dataa))) |>
step_tokenize(!!sym(x))

return(recc)
}

ddff = 'df'
label = 'xx'

myfunc(ddff, label) |> prep() |> bake(NULL) |> View()
# Error: Can't convert <textrecipes_tokenlist> to <character>.

I think !!sym(x) is not being converted properly.

I think the issue might literally just be the use of View(). When you stop the pipe at bake(), the code runs okay as far as I can tell; the RStudio viewing pane just can't seem to display a "tknlist". Note that I also set label = 'text' below.

library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#>   method                   from
#>   required_pkgs.model_spec parsnip
library(textrecipes)
rm(list = ls())

df = data.frame(text = c('blah blah blah', 'hello, hi, welcome', 'what, why'),
xx = as.factor(c('1', '0', '1')))

myfunc = function(dataa, x){
recc = recipe(as.formula(paste0(x, ' ~ text')), data = eval(parse(text=dataa))) |>
step_tokenize(!!sym(x))

return(recc)
}

ddff = 'df'
label = 'text'

myfunc(ddff, label) |> prep() |> bake(NULL)
#> # A tibble: 3 x 1
#>         text
#>    <tknlist>
#> 1 [3 tokens]
#> 2 [3 tokens]
#> 3 [2 tokens]

Created on 2021-12-02 by the reprex package (v2.0.1)

Just some friendly advice: rm(list = ls()) is a habit best avoided. The consequence might be that Jenny Bryan will set your computer on fire! See Project-oriented workflow (tidyverse.org).