Here's what I have as some raw data:
df <- tibble(x = c("a","b","c"),
id = c("x|y", "x", "x|y|z"),
name = c("i|j","i",'i|j|k'))
Here's what I want;
df_want <- tibble(x = c("a","a","b","c","c","c"),
id = c("x", "y", "x", "x", "y", "z"),
name = c("i","j", "i", "i", "j", "k"))
I've done it using a bunch of pivots...but I'd be surprised if this is the best way:
not_all_na <- function(x) {!all(is.na(x))}
df %>%
separate(id, into = paste0("ID_",letters),
sep = "\\|") %>%
separate(name, into = paste0("Name_",letters),
sep = "\\|") %>%
select_if(not_all_na) %>%
pivot_longer(cols = -x,
names_to = c("element", ".value"),
names_pattern = "(.+)_(.+)") %>%
pivot_longer(cols = c(-x, -element),
names_to = "letter",
values_to = "value") %>%
filter(!is.na(value)) %>%
pivot_wider(id_cols = c(x, letter),
names_from = element,
values_from = value) %>%
select(-letter)
I was trying to figure out if there's a way to use group_by
and have the summarize output be a bind_row of the separated characters, but I haven't gotten that to work yet. I'm not fluent in purrr...so I bet that is the key. Anyway, if anyone has any suggestions, I'd love to simplify this task.