Paste of specific variables

Hi Masters,
I have this simple df:

source <- data.frame(
  stringsAsFactors = FALSE,
  URN = c("aaa", "bbb", "ccc", "ddd", "eee", "fff", "ggg"),
  Name = c("xxx", "xxx", "yyy", "yyy", "yyy", "zzzz", "abcde"),
  Q1 = c("None.", NA, "No comments related to this exercise", "Na",
         "N/A", "Interesting comment", "abc"),
  P2 = c("Nothing", "I have nothing in common", "NA", NA,
         "Another comment", "....?", "xxxx"),
  Z3 = c("Service", "All good", "aa", "I don't know",
         "The final comment about that", "Nothing.", "na"),
  Q4 = c(2019, 2020, 2020, 2019, 2020, 2021, 2021)
)
source

Now I would like to merge all string variables that contain at least one value longer than 5 characters, without specifying their names ("Name" does not meet this criterion, as its maximum string length is 5).
I know I can do it this way:

library(dplyr)

merged.comments <- source %>%
  mutate(all_comments = paste(Q1, P2, Z3, sep = "/"))
merged.comments

but I don't want to specify variable names in paste. I would rather select the columns with a condition, as in mutate_if, like:

mutate_if(~is.character(.) & any(nchar(.) > 5, na.rm = TRUE)

How can I do that?

Maybe you can use the unite() function from tidyr, something like this:

library(dplyr)
library(tidyr)
merged_comments <- source %>%
  unite("all_comments", where(~is.character(.x) & any(nchar(.x) > 5)),
        sep = "/", remove = FALSE, na.rm = FALSE) # adjust na.rm argument as needed
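
To illustrate what the na.rm argument of unite() changes, here is a minimal sketch. The data frame demo and its column names are hypothetical and are not part of the original question.

```r
library(dplyr)
library(tidyr)

# Small hypothetical data frame, not the poster's real data
demo <- data.frame(
  id = c("a", "b"),
  c1 = c("first comment", NA),
  c2 = c("second comment", "only comment"),
  stringsAsFactors = FALSE
)

# na.rm = FALSE: a missing value is pasted in as the text "NA"
kept <- demo %>%
  unite("all_comments", c1, c2, sep = "/", remove = FALSE, na.rm = FALSE)

# na.rm = TRUE: missing values are dropped before pasting
dropped <- demo %>%
  unite("all_comments", c1, c2, sep = "/", remove = FALSE, na.rm = TRUE)

kept$all_comments
dropped$all_comments
```

Note that na.rm = TRUE only drops real missing values. Strings such as "NA" or "N/A" are ordinary text and are kept.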

Hi,
I know the task is solved, and the code works well with the example df, but when I apply it to my real dataset I get this error:

Error: 'nchar()' requires a character vector

Do you know what should be changed? In my real data file I have a mixture of int, chr, logi, num and Factor variables...

Also, when I change this part of the code:

any(nchar(.) > 5

to a bigger number, I get another error:

Error: `where()` must be used with functions that return `TRUE` or `FALSE`.

Do you know what it could be?
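
A likely explanation for both messages can be reproduced in base R. This is only a sketch: the vectors f and x below are hypothetical stand-ins for a factor column and a character column with missing values, since the real dataset is not shown.

```r
# Hypothetical columns standing in for the real data
f <- factor(c("short", "a much longer level"))
x <- c("abc", NA, "abcdef")

# `&` evaluates both sides, so nchar() is still called on the factor
# and fails; the error message is captured as a string here
err <- tryCatch(
  is.character(f) & any(nchar(f) > 5),
  error = function(e) conditionMessage(e)
)

# `&&` stops at the first FALSE, so nchar() is never reached
ok <- is.character(f) && any(nchar(f) > 5)

# nchar() of a missing string is NA, so any() returns NA
# whenever no value passes the threshold
any_5     <- any(nchar(x) > 5)
any_10    <- any(nchar(x) > 10)
any_10_rm <- any(nchar(x) > 10, na.rm = TRUE)
```

With a threshold of 5 the longer string makes any() return TRUE, which may be why the example df works. With a larger threshold the result becomes NA, which is neither TRUE nor FALSE, unless na.rm = TRUE is passed to any().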

Sorry, I don't use the formula shorthand often. I now realize the syntax should be:

where(~is.character(.x) && any(nchar(.x) > 5)) # two &

You can refer to the documentation of the where() function with ?where.

Well, when I keep the code with

&

my error is:

Error: 'nchar()' requires a character vector

When I run it with

&&

the error is

Error: `where()` must be used with functions that return `TRUE` or `FALSE`.
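
One way to avoid both errors, sketched under assumptions: combine && with na.rm = TRUE inside any(), so that nchar() is skipped for non-character columns and the predicate never returns NA. The data frame mixed and the helper is_long_text() are hypothetical names introduced for illustration; they stand in for the real dataset, which is not shown in this thread.

```r
library(dplyr)
library(tidyr)

# Hypothetical mixed-type data, standing in for the real dataset
mixed <- data.frame(
  id    = 1:3,
  grp   = factor(c("g1", "g2", "g1")),
  flag  = c(TRUE, FALSE, NA),
  name  = c("xxx", NA, "yyy"),
  note1 = c("None.", NA, "A longer comment"),
  note2 = c(NA, "Another comment", "ok"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns a single TRUE or FALSE for any column type.
# `&&` skips nchar() for non-character columns; na.rm = TRUE prevents NA.
is_long_text <- function(x, min_chars = 5) {
  is.character(x) && any(nchar(x) > min_chars, na.rm = TRUE)
}

merged <- mixed %>%
  unite("all_comments", where(is_long_text),
        sep = "/", remove = FALSE, na.rm = FALSE)

merged
```

To use a different threshold, the helper can be wrapped in the formula shorthand, for example where(~ is_long_text(.x, min_chars = 20)).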
