Mutate several columns (dplyr)

How can I mutate several columns at once? What I want to do is the following: for the last 11 columns of my data frame (df), I want to check whether each row has a value less than or equal to 0. I can do this one column at a time like this:

df <- df %>% mutate(p53_ = ifelse(p53 >= 0, "high", "low"))
df <- df %>% mutate(MAPK_ = ifelse(MAPK >= 0, "high", "low"))

How can I do this automatically for my last 11 columns? I tried to do it with across(), but it didn't work. I would like the new column names to be the original names with an underscore appended.

Thanks for the help!

The following code processes the last four columns of a small data frame and names each new column by appending _A to the original name. I can't find a way to append only the underscore. Of course, you can rename the columns in one additional step if you want to.

library(dplyr, warn.conflicts = FALSE)
DF <- data.frame(A = rnorm(5),
                 B = rnorm(5),
                 C = rnorm(5),
                 D = rnorm(5),
                 E = rnorm(5),
                 F = rnorm(5))
DF
#>             A          B           C           D          E          F
#> 1 -0.44259056  0.9532880 -0.35058707  0.03376189  1.3553500 -0.2581047
#> 2 -1.13236871  0.3209572  0.41210003  0.55098744  0.1635376  2.7682935
#> 3 -0.89852798 -0.9860722 -0.05389367  1.04235405  1.6957571  0.8801642
#> 4 -0.95514237  1.7461401  1.33101129  1.84669557  0.8223073 -1.3766335
#> 5  0.04420605 -1.3022793  2.69105787 -0.36676794 -0.8534940 -0.9556115

HighLow <- function(V) ifelse(V <= 0, "Low", "High")
DF2 <- DF %>% mutate_at(.vars = (ncol(DF) - 3):ncol(DF), 
                        .funs = list(A = HighLow))
DF2
#>             A          B           C           D          E          F  C_A
#> 1 -0.44259056  0.9532880 -0.35058707  0.03376189  1.3553500 -0.2581047  Low
#> 2 -1.13236871  0.3209572  0.41210003  0.55098744  0.1635376  2.7682935 High
#> 3 -0.89852798 -0.9860722 -0.05389367  1.04235405  1.6957571  0.8801642  Low
#> 4 -0.95514237  1.7461401  1.33101129  1.84669557  0.8223073 -1.3766335 High
#> 5  0.04420605 -1.3022793  2.69105787 -0.36676794 -0.8534940 -0.9556115 High
#>    D_A  E_A  F_A
#> 1 High High  Low
#> 2 High High High
#> 3 High High High
#> 4 High High  Low
#> 5  Low  Low  Low

Created on 2020-08-28 by the reprex package (v0.3.0)
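
For the additional renaming step mentioned above, one possible sketch is shown here. It assumes dplyr 1.0.0 or later (for rename_with()) and uses a small made-up stand-in for DF2 rather than the real output; the name DF2_renamed is just for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# Made-up stand-in for DF2: two original columns plus their "_A" companions
DF2 <- data.frame(E = c(1.3, -0.8),
                  F = c(-0.2, 2.7),
                  E_A = c("High", "Low"),
                  F_A = c("Low", "High"))

# Replace the trailing "_A" with "_" so only the underscore remains
DF2_renamed <- DF2 %>%
  rename_with(~ sub("_A$", "_", .x), .cols = ends_with("_A"))

names(DF2_renamed)
#> [1] "E"  "F"  "E_" "F_"
```

On older dplyr versions, names(DF2) <- sub("_A$", "_", names(DF2)) should do the same thing in base R.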


Thanks, exactly what I wanted. However, I don't see where the underscore is coming from? Is it automatically added in .funs?

Yes, if .funs is passed as a named list, the new columns are named as the combination of the original column name and the name from the list separated with an underscore.


I know this has already been answered, but as mutate_at() has been superseded, I thought I would contribute an answer using mutate(across()), which also allows for the single underscore at the end of the new names in a single call. Using @FJCC's example data.frame above:

DF3 <- DF %>%
  mutate(across(.cols = last_col() + (-3:0),
                .fns = ~ if_else(. <= 0, "Low", "High"),
                .names = "{.col}_"))

Explanation:
across() takes three arguments (plus ...): the columns we want to act on (very similar to the first argument in mutate_at()), the functions we want to apply, and a naming format for the output columns.

For the .cols argument, I passed last_col() + (-3:0), which selects the last 4 columns (and doesn't require looking back at the original dataset with ncol(.) or ncol(DF)).
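
As a side note, the same selection can probably also be written with the offset argument of last_col(), which avoids the arithmetic. This is only a sketch on a made-up data frame (the integer values and the name last_four are mine, not from the example above):

```r
library(dplyr, warn.conflicts = FALSE)

# Made-up data frame with six columns, A to F
DF <- data.frame(A = 1:2, B = 3:4, C = 5:6,
                 D = 7:8, E = 9:10, F = 11:12)

# last_col(offset = 3) is the fourth column from the end,
# so this range covers the last four columns
last_four <- DF %>% select(last_col(offset = 3):last_col())

names(last_four)
#> [1] "C" "D" "E" "F"
```

For the last 11 columns, the offset would be 10.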

For .fns, I used a lambda function (the ~ formula shorthand), where the . stands for the input column.
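
To make the shorthand a bit more concrete, here is a small sketch comparing the ~ form with an ordinary anonymous function. The data and the names with_lambda and with_function are made up for illustration; as far as I know both forms give the same result.

```r
library(dplyr, warn.conflicts = FALSE)

# Made-up data: one zero, one negative and two positive values
DF <- data.frame(A = c(-1, 2), B = c(0, 3))

# Formula shorthand: "." is the column currently being processed
with_lambda <- DF %>%
  mutate(across(everything(), ~ if_else(. <= 0, "Low", "High")))

# The same thing written as a regular anonymous function
with_function <- DF %>%
  mutate(across(everything(), function(x) if_else(x <= 0, "Low", "High")))

with_lambda
#>      A    B
#> 1  Low  Low
#> 2 High High
```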

For the naming, I used a glue-style string where {.col} evaluates to the name of the input column for each iteration. If we were using multiple functions in a list (just like in mutate_at()), we could use {.fn} as a placeholder for the name of each function in the list. A word of warning: I had to update my dplyr to v1.0.2 for this last bit to work.
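
A rough sketch of the multiple-function case, assuming a dplyr version that understands {.col} and {.fn}. The data frame, the list names level and abs, and the name DF4 are invented for this illustration:

```r
library(dplyr, warn.conflicts = FALSE)

# Made-up data frame; only B and C are transformed below
DF <- data.frame(A = c(-1, 2), B = c(0.5, -0.5), C = c(0, 3))

DF4 <- DF %>%
  mutate(across(.cols = c(B, C),
                # The list names ("level", "abs") are what {.fn} expands to
                .fns = list(level = ~ if_else(.x <= 0, "Low", "High"),
                            abs = ~ abs(.x)),
                .names = "{.col}_{.fn}"))

names(DF4)
#> [1] "A"       "B"       "C"       "B_level" "B_abs"   "C_level" "C_abs"
```

The new columns come out grouped by input column, with one column per function in the list.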

