Mutate columns only if they exist, from a set of possible columns

How do I mutate a column if it exists in a collection of possible columns?

I have a function that acts on a data frame containing some, but not all, of the columns mean, median, se, sd, and percent. Those columns come in as character, and I need to convert them to numeric.

I'd like to convert any columns that match to numeric.

Here's a reprex where I failed to use any_of.

library(tidyverse)

# how to use across to mutate columns if they exist?
n <- 5

testdf <- 
tibble(
  study = letters[1:n],
  mean = rnorm(n) %>% as.character(),
  sd = runif(n) %>% as.character()
)

# this works, only specifies columns that exist
testdf %>% mutate(across(c(mean, sd), as.numeric))
#> # A tibble: 5 x 3
#>   study   mean    sd
#>   <chr>  <dbl> <dbl>
#> 1 a     -1.18  0.803
#> 2 b     -0.974 0.102
#> 3 c     -2.05  0.411
#> 4 d     -0.447 0.743
#> 5 e     -1.02  0.643

# this doesn't work because se doesn't exist
testdf %>% mutate(across(c(mean, sd, se), as.numeric))
#> Error: Problem with `mutate()` input `..1`.
#> x Can't subset columns that don't exist.
#> x Column `se` doesn't exist.
#> ℹ Input `..1` is `across(c(mean, sd, se), as.numeric)`.

# attempts to use any_of that failed
testdf %>% mutate(across(any_of(mean, sd, se), as.numeric))
#> Error: Problem with `mutate()` input `..1`.
#> x `...` is not empty.
#> 
#> We detected these problematic arguments:
#> * `..1`
#> * `..2`
#> 
#> These dots only exist to allow future extensions and should be empty.
#> Did you misspecify an argument?
#> ℹ Input `..1` is `across(any_of(mean, sd, se), as.numeric)`.

testdf %>% mutate(across(any_of(c(mean, sd, se)), as.numeric))
#> Error: Problem with `mutate()` input `..1`.
#> x object 'se' not found
#> ℹ Input `..1` is `across(any_of(c(mean, sd, se)), as.numeric)`.

Created on 2021-02-22 by the reprex package (v1.0.0)

Would this meet your needs?

testdf %>% mutate(across(matches('mean|sd|se'), as.numeric))
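One caveat: matches() looks for the regular expression anywhere in the column name, so 'mean|sd|se' would also select a column such as baseline if one were present. Here is a minimal sketch of an anchored pattern, assuming only the exact names should be converted. The baseline column is hypothetical and only there to show the difference.

```r
library(dplyr)

# Hypothetical data: `baseline` contains "se" but should stay character
df <- tibble(
  study = c("a", "b"),
  mean = c("1.5", "2.5"),
  baseline = c("low", "high")
)

# Unanchored: also selects `baseline` and coerces it to NA (with a warning)
loose <- suppressWarnings(
  df %>% mutate(across(matches("mean|sd|se"), as.numeric))
)

# Anchored: selects only columns named exactly mean, sd, or se
strict <- df %>% mutate(across(matches("^(mean|sd|se)$"), as.numeric))
```

With the anchored pattern, baseline is left untouched and only mean is converted.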

Brilliant, thank you so much!

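Using any_of() works; you just need to pass the column names as a character vector. Bare names inside c() are evaluated as columns or objects, which is why the missing se raised "object 'se' not found", whereas strings are only matched against the names that actually exist.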

testdf %>% mutate(across(any_of(c("mean", "sd", "se")), as.numeric))
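Since the original question mentions doing this inside a function, here is a rough sketch of how that might look. The helper name convert_stats and the stat_cols vector are made up for illustration, and the sample values are invented.

```r
library(dplyr)

# Hypothetical list of summary columns that may or may not be present
stat_cols <- c("mean", "median", "se", "sd", "percent")

# Hypothetical helper: convert whichever of `cols` exist in `df` to numeric
convert_stats <- function(df, cols = stat_cols) {
  # any_of() takes a character vector and silently skips missing names
  df %>% mutate(across(any_of(cols), as.numeric))
}

# Example input with only two of the five possible columns
testdf <- tibble(
  study = letters[1:3],
  mean = c("0.1", "0.2", "0.3"),
  sd = c("1.1", "1.2", "1.3")
)

result <- convert_stats(testdf)
```

Here mean and sd are converted, study stays character, and the absent median, se, and percent columns are skipped without an error.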


Aha! I swear I tried that, but you're right, it does work. Cheers! I'm glad to make progress in understanding any_of, as it seems very useful.
