Create new variable in a dataframe through a function

Hi all,

I looked for similar topics in the forum but could not find an answer, so apologies if this has been covered, and apologies if the solution is simple.

I have some simple code to recode a variable in my data frame. Basically, I have a 1-10 variable called "x1" and I am recoding it into a 5-point categorical scale:

y <- c("Very low", "Low", "Average", "High", "Very high")
df <- df %>%
  mutate(x1 = ifelse(x1 %in% 1:2, y[1],
              ifelse(x1 %in% 3:4, y[2],
              ifelse(x1 %in% 5:6, y[3],
              ifelse(x1 %in% 7:8, y[4], y[5])))))

Thus far, all works well.

However, I have other variables (x2, x3, etc.) and I would like to write a function that recodes these variables automatically in my df. I wrote the following:

lab_lh <- function(var) {
  y <- c("Very low", "Low", "Average", "High", "Very high")
  df<- df%>%
    mutate(var = ifelse(var %in% 1:2, y[1],
                                ifelse(var %in% 3:4,y[2],
                                ifelse(var %in% 5:6,y[3],
                                ifelse(var %in% 7:8, y[4], y[5])))))
  }
lab_lh(df$x1) #I added the df because of the masking issues within functions and I did not manage to apply the solutions offered in other topics.

Is there a simple way to solve this?

Many thanks
Mary

Hello @maryog ,

try this:

suppressPackageStartupMessages(
  suppressWarnings(
    {
      library(dplyr)
    }
  )
)

mydf <- data.frame(
  x = 1:10
)


lab_lh <- function(df1,var) {
  y <- c("Very low", "Low", "Average", "High", "Very high")
  df1 %>%
    mutate(var = ifelse({{var}} %in% 1:2, y[1],
                                ifelse({{var}} %in% 3:4,y[2],
                                ifelse({{var}} %in% 5:6,y[3],
                                ifelse({{var}} %in% 7:8, y[4], y[5])))))
}

lab_lh(mydf,x)
#>     x       var
#> 1   1  Very low
#> 2   2  Very low
#> 3   3       Low
#> 4   4       Low
#> 5   5   Average
#> 6   6   Average
#> 7   7      High
#> 8   8      High
#> 9   9 Very high
#> 10 10 Very high
Created on 2022-04-11 by the reprex package (v2.0.1)

It's worth using case_when() instead of multiple ifelse() clauses:
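A minimal sketch of that idea, assuming the same mydf and category labels as in the reply above (this is one possible rewrite, not necessarily the code the poster had in mind). case_when() checks the conditions in order and uses the first one that matches, with TRUE acting as the final "else" branch:

```r
library(dplyr)

mydf <- data.frame(x = 1:10)

# Same recoding as the nested ifelse() version, written with case_when().
lab_lh <- function(df1, var) {
  y <- c("Very low", "Low", "Average", "High", "Very high")
  df1 %>%
    mutate(var = case_when(
      {{ var }} %in% 1:2 ~ y[1],
      {{ var }} %in% 3:4 ~ y[2],
      {{ var }} %in% 5:6 ~ y[3],
      {{ var }} %in% 7:8 ~ y[4],
      TRUE               ~ y[5]  # everything else, as in the ifelse() chain
    ))
}

res <- lab_lh(mydf, x)
```

Like the ifelse() chain, this sends any value outside 1:8 (including NA) to the last category, so it may be worth adding an explicit branch for 9:10 if the data can contain other values.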

Thanks @HanOostdijk. The issue is that the output only has "var" added as an additional column (instead of replacing my x1), and my data frame itself is unchanged.

The code only requires a small change to implement that:

lab_lh <- function(df1,var) {
  y <- c("Very low", "Low", "Average", "High", "Very high")
  df1 %>%
    mutate({{var}} := ifelse({{var}} %in% 1:2, y[1],
                        ifelse({{var}} %in% 3:4,y[2],
                               ifelse({{var}} %in% 5:6,y[3],
                                      ifelse({{var}} %in% 7:8, y[4], y[5])))))
}

Use {{var}} on the left-hand side to give the resulting column the name of whatever was passed in as var; := (instead of =) is the special operator that makes that work.
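One thing that may help here: the function returns a modified copy and does not change the data frame it was given, so the result has to be assigned back. A small sketch, assuming a made-up data frame mydf with columns x1 and x2 (hypothetical example data):

```r
library(dplyr)

lab_lh <- function(df1, var) {
  y <- c("Very low", "Low", "Average", "High", "Very high")
  df1 %>%
    mutate({{ var }} := ifelse({{ var }} %in% 1:2, y[1],
                        ifelse({{ var }} %in% 3:4, y[2],
                        ifelse({{ var }} %in% 5:6, y[3],
                        ifelse({{ var }} %in% 7:8, y[4], y[5])))))
}

# Hypothetical example data
mydf <- data.frame(x1 = 1:10, x2 = 10:1)

# Calling lab_lh(mydf, x1) on its own only prints the result;
# assigning it back is what updates mydf.
mydf <- lab_lh(mydf, x1)
mydf <- lab_lh(mydf, x2)
```

After these two calls, x1 and x2 in mydf hold the labels instead of the numbers, and no extra column is added.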

Thanks all,

I added a few tweaks to the suggestions above as the code did not permit me to change the variable in the data frame (it was only changed in the function output).

Here's what I got:

lab_lh <- function(var, y) {
  var <- ifelse(var %in% 1:2, y[1],
         ifelse(var %in% 3:4, y[2],
         ifelse(var %in% 5:6, y[3],
         ifelse(var %in% 7:8, y[4], y[5]))))
  factor(var, levels = y)
}

And then, applied to a given list of categories (argument y in the function) and a list of variables:

var_list <- df %>% 
  select(x1:x3)
cat<- c("Very low", "Low", "Average", "High", "Very high")
D1 <- as.data.frame(lapply(var_list, lab_lh, y = cat))
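As an alternative sketch, the same vector-level function can be applied inside mutate() with across(), which recodes the selected columns in place and keeps the other columns of the data frame. The example data and the name cats are assumptions for illustration (cats is used instead of cat to avoid shadowing base R's cat() function):

```r
library(dplyr)

# Vector-level recoding function, as above
lab_lh <- function(var, y) {
  var <- ifelse(var %in% 1:2, y[1],
         ifelse(var %in% 3:4, y[2],
         ifelse(var %in% 5:6, y[3],
         ifelse(var %in% 7:8, y[4], y[5]))))
  factor(var, levels = y)
}

# Hypothetical example data
df <- data.frame(id = 1:4,
                 x1 = c(1, 4, 6, 10),
                 x2 = c(2, 3, 8, 9),
                 x3 = c(5, 7, 1, 10))
cats <- c("Very low", "Low", "Average", "High", "Very high")

# Recode x1 to x3 in place; id is left untouched
df <- df %>%
  mutate(across(x1:x3, function(v) lab_lh(v, y = cats)))
```

This avoids building a separate data frame (D1) and merging it back, though the lapply() approach works equally well when only the recoded columns are needed.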

And the whole thing works now!

Many thanks for your help!
Mary
