Rename columns programmatically

Hi there,
I am trying to rename columns programmatically. I have several numbered columns of two types (DIAG_ and OPER_) which are created from a pivot_wider call. They need to have leading zeroes for later use, but don't when they are reshaped.

I have been trying this approach, but it doesn't do what is intended:

dat <- tibble(DIAG_1 = runif(10), DIAG_2 = runif(10), DIAG_3 = runif(10), DIAG_4 = runif(10), OPER_1 = runif(10), OPER_2 = runif(10), OPER_3 = runif(10), OPER_4 = runif(10))
prefixes <- c('DIAG_', 'OPER_')
prefixes %>% 
  map(grep, x=names(dat)) %>% 
    map2(prefixes, ~function(x, y) names(dat)[x] <- paste0(y, str_pad(1:length(x), width = 2, pad = '0')))

Any help on how to add leading zeroes to various auto-numbered columns would be gratefully accepted!

Regards,
Will
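
A note on why the snippet above appears to do nothing: `~function(x, y) ...` builds a purrr lambda whose body is a function definition, so map2 returns a list of functions without calling any of them. Even if they were called, `names(dat)[x] <- ...` inside a function only changes a local copy of dat. Below is a minimal sketch of the same idea as a plain loop, assuming every matching column is named as prefix plus number. The small `dat` is a stand-in, not the real data.

```r
library(stringr)
library(tibble)

# Stand-in data with the same naming pattern as in the question
dat <- tibble(DIAG_1 = 1:3, DIAG_2 = 1:3, OPER_1 = 1:3, OPER_2 = 1:3)
prefixes <- c("DIAG_", "OPER_")

new_names <- names(dat)
for (p in prefixes) {
  # Positions of the columns that start with this prefix
  idx <- which(startsWith(new_names, p))
  # Keep the number already in the name rather than renumbering by position
  num <- str_remove(new_names[idx], fixed(p))
  new_names[idx] <- paste0(p, str_pad(num, width = 2, pad = "0"))
}

# Assign at top level so the change is applied to dat itself
names(dat) <- new_names
names(dat)
```

Padding the existing number, instead of using 1:length(x), avoids silently renumbering columns if they are not in order.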

Thanks for the reprex, Will.

suppressPackageStartupMessages(library(stringr))
suppressPackageStartupMessages(library(tibble)) 
dat <-
  tibble(
    DIAG_1 = runif(10),
    DIAG_2 = runif(10),
    DIAG_3 = runif(10),
    DIAG_4 = runif(10),
    OPER_1 = runif(10),
    OPER_2 = runif(10),
    OPER_3 = runif(10),
    OPER_4 = runif(10)
  )
i_have <- "(\\p{Uppercase}{4}_\\d)"
i_want <- "0000\\1"
colnames(dat) -> ionic
str_replace(ionic,i_have,i_want)
#> [1] "0000DIAG_1" "0000DIAG_2" "0000DIAG_3" "0000DIAG_4" "0000OPER_1"
#> [6] "0000OPER_2" "0000OPER_3" "0000OPER_4"

Created on 2020-04-09 by the reprex package (v0.3.0)

Hi Will,

Good case for using rename_if:

library(dplyr)  # provides %>% and rename_if
library(purrr)  # provides map_lgl

should_prefix <- function(x) any(stringr::str_detect(x, c("DIAG", "OPER")))
apply_prefix <- function(colname, prefix) paste0(prefix, colname)

dat_newcolnames <- dat %>%
  dplyr::rename_if(
    map_lgl(names(.), should_prefix),
    ~apply_prefix(., "000")
  )

One caveat: the interface for rename_if is a bit awkward. For the predicate argument I had to pass a logical vector of length ncol(dat). Passing a function did not work as expected, presumably because rename_if applies a predicate function to each column's contents rather than to its name.
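
For selecting columns by name rather than by content, dplyr 1.0.0 and later provide rename_with, which takes a tidyselect expression in .cols. Here is a small sketch, assuming that version is installed and using a cut-down stand-in (`dat_small`, a made-up name) instead of the real dat:

```r
library(dplyr)

# Stand-in data: two columns that should be renamed and one that should not
dat_small <- data.frame(DIAG_1 = 1:3, OPER_1 = 1:3, other = 1:3)

# .cols selects by column name; the function is applied to the selected names only
renamed <- rename_with(
  dat_small,
  ~ paste0("000", .x),
  .cols = matches("^(DIAG|OPER)_")
)
names(renamed)
```

Columns that do not match the pattern keep their original names.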

Also, not sure if this is relevant since I haven't seen your exact code, but pivot_wider does have a 'names_prefix' argument. If it would work in your actual situation to add names_prefix = "000" then that's probably the way to go.
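
To illustrate the names_prefix idea: it prepends one fixed string to every new column name, so any padding of the number itself has to be done on the names_from column before pivoting. The long-format data below is hypothetical (`id`, `pos` and `code` are made-up column names), since the original pivot_wider call isn't shown.

```r
library(tidyr)
library(stringr)

# Hypothetical long-format data: one row per id and diagnosis position
long <- data.frame(
  id   = rep(1:2, each = 3),
  pos  = rep(c(1, 2, 10), times = 2),
  code = c("A", "B", "C", "D", "E", "F")
)

# Pad the position first so the generated names come out as 01, 02, 10
long$pos <- str_pad(as.character(long$pos), width = 2, pad = "0")

wide <- pivot_wider(
  long,
  names_from   = pos,
  values_from  = code,
  names_prefix = "DIAG_"
)
names(wide)
```

With this approach no renaming is needed after the reshape.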

Thanks for the suggestions, but I think maybe I should have specified more clearly the result I was looking for? The original variables are auto-numbered like this:
DIAG_1, DIAG_2, DIAG_3, DIAG_4, DIAG_5, DIAG_6, DIAG_7, DIAG_8, DIAG_9, DIAG_10
What I need is the leading zero on the number...
DIAG_01, DIAG_02, DIAG_03, DIAG_04, DIAG_05, DIAG_06, DIAG_07, DIAG_08, DIAG_09, DIAG_10

But yes, they are both good solutions, and I can see how they would work. Thanks very much Technocrat and Gabriel!!!

Regards,
Will

i_want <- "\\1000"

Ahhh I'm with you. In retrospect that makes more sense. In that case I'd say

should_add_leading_zeroes <- function(x) {
  if(!stringr::str_detect(x, "_")) return(FALSE)
  
  split_x <- stringr::str_split(x, "_")[[1]]
  is_right_prefix <- any(stringr::str_detect(split_x[[1]], c("DIAG", "OPER")))
  is_less_than_ten <- as.integer(split_x[[2]]) < 10L
  
  return(is_right_prefix & is_less_than_ten)
}

dat_newcolnames <- dat %>%
  dplyr::rename_if(
    purrr::map_lgl(names(.), should_add_leading_zeroes),
    ~stringr::str_replace(., "_", "_0")
  )

Cheers,
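
The same rule (right prefix, single-digit number) can probably also be written as one regular expression. This is a sketch in base R, assuming names are exactly prefix, underscore, digits. The example vector is made up to show which names are left alone.

```r
# Example names: only DIAG_/OPER_ columns with a single digit should change
nms <- c("DIAG_1", "DIAG_10", "OPER_9", "OTHER_1", "id")

# Group 1 is the prefix, group 2 the lone digit; anchors stop DIAG_10 from matching
padded <- sub("^(DIAG|OPER)_(\\d)$", "\\1_0\\2", nms)
padded
```

The result could then be assigned with names(dat) <- padded, or the sub() call passed as the renaming function.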

I have also reformulated a bit based on an idea I got from your suggestions:

library(stringr)  # for str_sub
df <- data.frame(DIAG_1 = 1:5, DIAG_2 = 1:5, DIAG_3 = 1:5, DIAG_10 = 1:5)
nms <- names(df)
prefix <- str_sub(nms, 1, 5)
second_last_char <- str_sub(nms, -2, -2)
last_char <- str_sub(nms, -1, -1)
newnms <- ifelse(second_last_char == "_", paste0(prefix, "0", last_char), nms)
names(df) <- newnms

Simple approach, but obviously depends on the consistency of single numbered variables being the only ones with an underscore as second from last character.

All the best,
Will
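
To relax the assumption about where the underscore sits, one option is to split each name into everything up to the last underscore plus the trailing digits, then pad the digits to a fixed width. This is a base R sketch; the helper name `pad_suffix` and its `width` argument are made up for illustration.

```r
# Pad the trailing number of names like "DIAG_1" to a fixed width.
# Names that do not end in "_<digits>" are returned unchanged.
pad_suffix <- function(nms, width = 2) {
  # For each name: full match, text up to the last underscore, trailing digits
  parts <- regmatches(nms, regexec("^(.*_)(\\d+)$", nms))
  vapply(seq_along(nms), function(i) {
    p <- parts[[i]]
    if (length(p) == 0) return(nms[i])  # no match, keep the name as is
    paste0(p[2], formatC(as.integer(p[3]), width = width, flag = "0"))
  }, character(1))
}

df <- data.frame(DIAG_1 = 1:5, DIAG_2 = 1:5, DIAG_10 = 1:5, id = 1:5)
names(df) <- pad_suffix(names(df))
names(df)
```

Because the prefix is captured rather than hard-coded, this works for prefixes of any length, and a larger width handles numbering past 99.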
