Let's say I have a tibble named my_data:
library(tidyverse)  # provides tibble(), %>%, str_split(), flatten_chr(), bind_rows()

my_data = tibble(row_number = c("row1,row2", "row3,row4,row5", "row6,row7"),
                 variable1 = rnorm(3),
                 variable2 = rnorm(3))
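I want to split each row of my_data into multiple rows so that each row has exactly one label in the row_number column, with the values of variable1 and variable2 repeated for every label that came from the same original row. This is my current code: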
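# Build one small tibble per input row, then stack them.
my_data2 = vector(mode = "list", length = nrow(my_data))
for (i in seq_len(nrow(my_data))) {
  # Split the comma-separated labels; the scalar variables are recycled.
  my_data2[[i]] = tibble(row_number = my_data$row_number[i] %>% str_split(",") %>% flatten_chr,
                         variable1 = my_data$variable1[i],
                         variable2 = my_data$variable2[i])
}
my_data2 = bind_rows(my_data2)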
Although this code does exactly what I want, it takes too long when my_data is very large (my actual data frame has hundreds of thousands of rows). Is there a more efficient way to do this?
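One direction that may be worth trying is a vectorised split instead of the explicit loop. The sketch below assumes the tidyr package is installed and uses its separate_rows() function, which splits a delimited column and repeats the remaining columns for each piece. I have not benchmarked it on data of the size described above, so treat it as a starting point and not a measured fix.

```r
library(tibble)
library(tidyr)

# Same shape as my_data above; the numeric values are random.
my_data = tibble(row_number = c("row1,row2", "row3,row4,row5", "row6,row7"),
                 variable1 = rnorm(3),
                 variable2 = rnorm(3))

# Split row_number on commas. Each piece gets its own row, and
# variable1 / variable2 are copied from the originating row.
my_data2 = separate_rows(my_data, row_number, sep = ",")
```

In tidyr 1.3.0 and later, separate_longer_delim(my_data, row_number, delim = ",") is the newer spelling of the same operation.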