Iterate str_replace_all() over columns

eva4 · October 21, 2022, 4:43pm

I have a population data set in which the values contain k's, M's and dots. I removed them from the '1799' column using this line:

p$'1799' <- str_replace_all(p$'1799', c("(\\d+)k" = "\\1000", "(\\d+)M" = "\\1000000", "(\\d+)\\." = ""))

How do I iterate over the other years?

I was thinking of a for loop, but it does not work:

p_new <- p
for (i in 2:ncol(p_new)) {
  p_new[ , i] <- str_replace_all(p_new[ , i], c("(\\d+)k" = "\\1000", "(\\d+)M" = "\\1000000", "(\\d+)\\." = ""))
}
p_new

I have to do this before pivoting the dataset.
All suggestions are welcome.

FJCC · October 21, 2022, 5:13pm

You can use the mutate() and across() functions from the dplyr package.

DF <- data.frame(country = c("A", "B"),
                 `1799` = c("8M", "3k"),
                 `1800` = c("2400", "75M"),
                 `1801` = c("6k", "23"),check.names = FALSE)
DF
#>   country 1799 1800 1801
#> 1       A   8M 2400   6k
#> 2       B   3k  75M   23
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(stringr)
DF <- DF |> mutate(across(.cols = -country, 
                          .fns = ~str_replace_all(.x, c("(\\d+)k" = "\\1000", 
                                                        "(\\d+)M" = "\\1000000",
                                                        "(\\d+)\\." = "" ))))
DF
#>   country    1799     1800 1801
#> 1       A 8000000     2400 6000
#> 2       B    3000 75000000   23

Created on 2022-10-21 with reprex v2.0.2
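
For anyone who prefers to keep the for loop from the question, here is a minimal sketch of one way it could be written. It rests on an assumption that the thread does not confirm: that p is a tibble (for example, read in with readr), in which case p_new[ , i] returns a one-column tibble rather than a character vector, and str_replace_all() does not get the input it expects. Extracting the column with [[i]] avoids that for both data frames and tibbles. The sample data and the name rules are made up for illustration.

```r
library(stringr)

# Hypothetical sample data, shaped like the example above
p <- data.frame(country = c("A", "B"),
                `1799` = c("8M", "3k"),
                `1800` = c("2400", "75M"),
                check.names = FALSE)

# Same replacement rules as in the question, stored once for reuse
rules <- c("(\\d+)k" = "\\1000",
           "(\\d+)M" = "\\1000000",
           "(\\d+)\\." = "")

p_new <- p
for (i in 2:ncol(p_new)) {
  # [[i]] extracts column i as a plain vector, for data frames and tibbles alike
  p_new[[i]] <- str_replace_all(p_new[[i]], rules)
}
p_new
```

The columns are still character after this step, so they would probably need as.numeric() before any arithmetic on the pivoted data.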
