I have a data frame with a column that contains strings with sections separated by underscores. I'd like to create two new columns, with the section after the last underscore in one column and the rest in another. The only reliable way I've found is to use rowwise() as below:
suppressMessages(library(tidyverse))

df <- tibble(
  x = c(
    "a_b",
    "a_b_c",
    "a_b_c_d",
    "a_b_l",
    "a_b_q_r_s_h"
  )
)

df %>%
  mutate(
    splits = strsplit(x, "_")
  ) %>%
  rowwise() %>%
  mutate(
    last_value = splits[length(splits)],
    the_rest = paste(splits[1:length(splits)-1], collapse = "_")
  )
#> Source: local data frame [5 x 4]
#> Groups: <by row>
#>
#> # A tibble: 5 x 4
#>             x splits last_value  the_rest
#>         <chr> <list>      <chr>     <chr>
#> 1         a_b <chr >          b         a
#> 2       a_b_c <chr >          c       a_b
#> 3     a_b_c_d <chr >          d     a_b_c
#> 4       a_b_l <chr >          l       a_b
#> 5 a_b_q_r_s_h <chr >          h a_b_q_r_s
In practice I need to run this on 8 to 10 million rows of data, and at that size rowwise() becomes far too slow. I haven't been able to find a reliable vectorized solution, so any ideas would be greatly appreciated!
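One vectorized direction that may fit here is a regex-based split using base R's sub(), which works on the whole character vector in a single call rather than row by row. The sketch below is only an illustration and has not been timed on data of this size. It assumes every string contains at least one underscore: a string with none would come back unchanged in both columns, whereas the rowwise() version would give an empty the_rest. The names last_value and the_rest are taken from the example above.

```r
# Sketch: split each string at its LAST underscore with vectorized regex calls.
x <- c("a_b", "a_b_c", "a_b_c_d", "a_b_l", "a_b_q_r_s_h")

# Part after the last underscore: the greedy "^.*_" matches everything
# up to and including the final "_", which is then removed.
last_value <- sub("^.*_", "", x)

# Part before the last underscore: remove the final "_" together with
# the run of non-underscore characters that follows it.
the_rest <- sub("_[^_]*$", "", x)

# Assumption: each element of x has at least one "_"; otherwise both
# results equal the input string.
result <- data.frame(x, last_value, the_rest, stringsAsFactors = FALSE)
print(result)
```

The same two sub() calls could be placed directly inside a single mutate() on df, with no strsplit() list column and no rowwise() step.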