applying mutate(), across(), starts_with(), parse_number() all together

How do I convert character columns (if there are any) to numeric, but only among the variables whose names start with "dk_"?

library(dplyr)
library(readr)

# Toy data -------------------
df1 <- tibble(
  dk_1 = as.character(1:5),
  dk_2 = 9:13,
  dk_44 = 100:104,
  mk = as.character(1:5)
)
df1
#> # A tibble: 5 x 4
#>   dk_1   dk_2 dk_44 mk   
#>   <chr> <int> <int> <chr>
#> 1 1         9   100 1    
#> 2 2        10   101 2    
#> 3 3        11   102 3    
#> 4 4        12   103 4    
#> 5 5        13   104 5

# all numeric
df2 <- tibble(
  dk_1 = 1:5,
  dk_2 = 9:13,
  dk_44 = 100:104,
  mk = as.character(1:5)
)
df2
#> # A tibble: 5 x 4
#>    dk_1  dk_2 dk_44 mk   
#>   <int> <int> <int> <chr>
#> 1     1     9   100 1    
#> 2     2    10   101 2    
#> 3     3    11   102 3    
#> 4     4    12   103 4    
#> 5     5    13   104 5

# My attempt---------------------
df1 %>%
  mutate(across(starts_with("dk_"), parse_number))
#> Error: Problem with `mutate()` input `..1`.
#> i `..1 = across(starts_with("dk_"), parse_number)`.
#> x is.character(x) is not TRUE

df2 %>%
  mutate(across(starts_with("dk_"), parse_number))
#> Error: Problem with `mutate()` input `..1`.
#> i `..1 = across(starts_with("dk_"), parse_number)`.
#> x is.character(x) is not TRUE

Created on 2022-01-18 by the reprex package (v2.0.1)

Have you loaded readr, the package that contains parse_number()?

Just tried. Unfortunately, it still does not work.

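I think you need to test that the column is character before using parse_number(): it only accepts character vectors, so it fails on the integer dk_ columns. Combining where(is.character) with starts_with("dk_") restricts across() to the columns that actually need parsing.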

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(readr)

# Toy data -------------------
df1 <- tibble(
  dk_1 = as.character(1:5),
  dk_2 = 9:13,
  dk_44 = 100:104,
  mk = as.character(1:5)
)
df1
#> # A tibble: 5 x 4
#>   dk_1   dk_2 dk_44 mk   
#>   <chr> <int> <int> <chr>
#> 1 1         9   100 1    
#> 2 2        10   101 2    
#> 3 3        11   102 3    
#> 4 4        12   103 4    
#> 5 5        13   104 5

# all numeric
df2 <- tibble(
  dk_1 = 1:5,
  dk_2 = 9:13,
  dk_44 = 100:104,
  mk = as.character(1:5)
)
df2
#> # A tibble: 5 x 4
#>    dk_1  dk_2 dk_44 mk   
#>   <int> <int> <int> <chr>
#> 1     1     9   100 1    
#> 2     2    10   101 2    
#> 3     3    11   102 3    
#> 4     4    12   103 4    
#> 5     5    13   104 5

# Parse only the character columns that start with "dk_" ----------
df1 %>%
  mutate(across(where(is.character) & starts_with("dk_"), parse_number))
#> # A tibble: 5 x 4
#>    dk_1  dk_2 dk_44 mk   
#>   <dbl> <int> <int> <chr>
#> 1     1     9   100 1    
#> 2     2    10   101 2    
#> 3     3    11   102 3    
#> 4     4    12   103 4    
#> 5     5    13   104 5

df2 %>%
  mutate(across(where(is.character) & starts_with("dk_"), parse_number))
#> # A tibble: 5 x 4
#>    dk_1  dk_2 dk_44 mk   
#>   <int> <int> <int> <chr>
#> 1     1     9   100 1    
#> 2     2    10   101 2    
#> 3     3    11   102 3    
#> 4     4    12   103 4    
#> 5     5    13   104 5

Created on 2022-01-18 by the reprex package (v2.0.1)
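
For what it's worth, here is a small sketch of why the extra where(is.character) condition matters. It rebuilds the df1 and df2 tibbles from above so it is self-contained; the names chr_cols_df1, chr_cols_df2 and res are only illustrative. It assumes a readr version that behaves like the one in the reprex above, where parse_number() rejects non-character input.

```r
library(dplyr)
library(readr)

# Same toy data as above
df1 <- tibble(
  dk_1 = as.character(1:5),
  dk_2 = 9:13,
  dk_44 = 100:104,
  mk = as.character(1:5)
)
df2 <- tibble(
  dk_1 = 1:5,
  dk_2 = 9:13,
  dk_44 = 100:104,
  mk = as.character(1:5)
)

# Which columns are both character AND start with "dk_"?
chr_cols_df1 <- df1 %>%
  select(where(is.character) & starts_with("dk_")) %>%
  names()
chr_cols_df2 <- df2 %>%
  select(where(is.character) & starts_with("dk_")) %>%
  names()

chr_cols_df1 # only dk_1 qualifies in df1
chr_cols_df2 # nothing qualifies in df2, so across() has nothing to do

# parse_number() on an integer vector raises an error, which is what
# the original attempt ran into for dk_2 and dk_44
res <- tryCatch(parse_number(1:5), error = function(e) e)
inherits(res, "error")
```

When the selection is empty, as in df2, mutate(across(...)) simply returns the data unchanged, which matches the second output above.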

df1 %>%
  mutate_at(vars(starts_with("dk_")), as.numeric)

Or use as.integer instead of as.numeric.

Hope it helps,
Grzegorz

@FJCC Many thanks! Just what I wanted :smiley:

@gsapijaszko thanks! It works, but note that mutate_at() has been superseded by across(). See the dplyr reference page for mutate_all() ("Mutate multiple columns").
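
As a rough sketch, the across() equivalent of the mutate_at() call could look like the following. It reuses the df1 toy data from the question; the name out is only for illustration.

```r
library(dplyr)

# Same toy data as in the question
df1 <- tibble(
  dk_1 = as.character(1:5),
  dk_2 = 9:13,
  dk_44 = 100:104,
  mk = as.character(1:5)
)

# as.numeric() accepts both character and integer input, so no
# where(is.character) filter is needed here
out <- df1 %>%
  mutate(across(starts_with("dk_"), as.numeric))

# Inspect the resulting column types
sapply(out, class)
```

Two differences from the parse_number() solution are worth keeping in mind: as.numeric() also turns the integer dk_ columns into doubles, and it returns NA (with a warning) for strings such as "$5" or "1,000", which parse_number() would parse.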

Sure, however you asked for specific "variables that start with "dk_"?" :slight_smile:

Regards,
Grzegorz

@gsapijaszko See the solution; it also makes use of starts_with().
