Elegant and shorter way of recoding

Hi R Masters,
I have the following simple df:


df <- data.frame(
  stringsAsFactors = FALSE,
               URN = c("aaa", "bbb", "ccc"),
               Q9a = c("Exellant", "!", "Honest"),
               Q9b = c("x", NA, "Trustworthy"),
               Q9c = c("Friendly", "Professional", NA)
)

I would like to recode all NAs and very short strings (fewer than two characters) as "Blank" and everything else as "Comment".
I know I can do it using this code:

library(dplyr)
library(stringr)

df <- df %>% 
  mutate(Q9a.Blank = case_when(is.na(Q9a) ~ 1,
                               str_length(Q9a) < 2 ~ 1),
         Q9b.Blank = case_when(is.na(Q9b) ~ 1,
                               str_length(Q9b) < 2 ~ 1),
         Q9c.Blank = case_when(is.na(Q9c) ~ 1,
                               str_length(Q9c) < 2 ~ 1))

df$Q9a.Blank <- recode_factor(df$Q9a.Blank, `1` = "Blank", .missing = "Comment")
df$Q9b.Blank <- recode_factor(df$Q9b.Blank, `1` = "Blank", .missing = "Comment")
df$Q9c.Blank <- recode_factor(df$Q9c.Blank, `1` = "Blank", .missing = "Comment")

But I am sure there is a more elegant and much shorter way, given that all the new variables have ".Blank" in their names.

Can you help?

dplyr v1.0.* supports a new syntax for mutate, across(), which for the second part of your code might look a bit like this:

df %>%
  # .x is the current column; recode_factor() needs it as its first argument
  mutate(across(ends_with(".Blank"), ~ recode_factor(.x, `1` = "Blank", .missing = "Comment")))

Answering quickly but I expect you could make something similar for the first half!
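
For anyone unsure what recode_factor() does with the 1/NA flags from the question, here is a minimal sketch on a standalone vector. The names flags and labels are only illustrative and do not appear in the original code.

```r
library(dplyr)

# Hypothetical flag vector shaped like the Q9x.Blank columns: 1 = blank, NA = has a comment
flags <- c(1, NA, 1)

# `1` is matched against the numeric values; .missing supplies the label for NA
labels <- recode_factor(flags, `1` = "Blank", .missing = "Comment")

labels
levels(labels)
```

The result should be a factor with the levels "Blank" and "Comment", in that order, because levels follow the order of the replacements with .missing last.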

Maybe something like this

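df %>% 
  # label every column except URN in one pass
  mutate(across(-URN, ~ifelse(is.na(.) | str_length(.) < 2, 'Blank', 'Comment'))) %>% 
  # setNames() is base R, so nothing beyond dplyr and stringr needs to be loaded
  setNames(c(names(df[1]), paste0(names(df[-1]), '.Blank'))) %>% 
  # join back on URN to keep the original columns next to the new ones
  inner_join(df, ., by = "URN")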
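
A possible variation on the same idea, sketched under a few assumptions: the .names argument of across() creates the ".Blank" columns directly, so the renaming and the join are not needed. It uses base nchar() instead of str_length(), if_else() instead of ifelse(), and wraps the result in factor() to mimic the output of recode_factor(); result is just an illustrative name.

```r
library(dplyr)

df <- data.frame(
  stringsAsFactors = FALSE,
  URN = c("aaa", "bbb", "ccc"),
  Q9a = c("Exellant", "!", "Honest"),
  Q9b = c("x", NA, "Trustworthy"),
  Q9c = c("Friendly", "Professional", NA)
)

result <- df %>%
  mutate(across(
    starts_with("Q9"),
    # NA or fewer than 2 characters counts as blank; is.na() is checked first,
    # so the NA returned by nchar() for missing strings never reaches if_else()
    ~ factor(if_else(is.na(.x) | nchar(.x) < 2, "Blank", "Comment"),
             levels = c("Blank", "Comment")),
    # keep the original columns and add Q9a.Blank, Q9b.Blank, Q9c.Blank
    .names = "{.col}.Blank"
  ))

result
```

Because across() resolves starts_with("Q9") before any new columns are added, the freshly created ".Blank" columns are not fed back into the function.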

Thank you very much, but the tricky bit in the first part is not just selecting the variables ending with ".Blank": each new variable also has to be built from its own source variable (a, b or c). I don't know how I could do that.

library(dplyr)
library(stringr)
df <- data.frame(
  stringsAsFactors = FALSE,
  URN = c("aaa", "bbb", "ccc"),
  Q9a = c("Exellant", "!", "Honest"),
  Q9b = c("x", NA, "Trustworthy"),
  Q9c = c("Friendly", "Professional", NA)
)


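# Flag NA or very short answers with 1; everything else stays NA
caseblank <- function(x){
  case_when(is.na(x) ~ 1,
            str_length(x) < 2 ~ 1)
}


# Apply to every Q9 column; .names writes each result to a new "<column>.Blank" variable
df <- df %>% 
  mutate(across(starts_with("Q9"),
                ~caseblank(.),.names = "{.col}.Blank"))

# Turn the 1/NA flags into a factor with the levels "Blank" and "Comment"
blankf <- function(x){
  recode_factor(x, `1` = "Blank", .missing = "Comment")
}

df <- df %>%  mutate(across(ends_with(".Blank"), blankf))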