Thank you again!
Almost, but not exactly. I had a look at that SO answer, but it isn't quite what I need. I don't know if this method is "correct", but I've almost managed to get what I need with dplyr, stringr and tidyr.
library(dplyr, warn.conflicts = FALSE)
library(tidyr)
library(stringr)
set.seed(1)
values <- c("1+1", "1+2", "1+3", "1+4", "1+5", "1+6")
Df <- tibble(
  ab = sample(values, 10, replace = TRUE),
  ba = sample(values, 10, replace = TRUE),
  cd = sample(values, 10, replace = TRUE),
  dc = sample(values, 10, replace = TRUE),
  de = sample(values, 10, replace = TRUE),
  ed = sample(values, 10, replace = TRUE)
)
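# Recode every column into three levels, then add a row id
# (across() replaces the deprecated mutate_all(funs(...)) idiom)
Df <- Df %>%
  mutate(across(
    everything(),
    ~ case_when(
      . == "1+1" | . == "1+2" ~ 1,
      . == "1+3" | . == "1+4" ~ 2,
      . == "1+5" | . == "1+6" ~ 3
    )
  )) %>%
  mutate(id = 1:10)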
Df %>%
  gather(key = "varname", value = "values", ab:ed) %>%
  mutate(has_letter = case_when(
    str_detect(varname, "a") ~ "has_a",
    str_detect(varname, "c") ~ "has_c",
    str_detect(varname, "e") ~ "has_e"
  )) %>%
  group_by(id, has_letter) %>%
  summarise(maxval = max(values)) %>%
  spread(key = has_letter, value = maxval) %>%
  # Is it possible to optimize the following case_when so I don't have to write out all possible conditions?
  # I need the NAME of the column with the highest value; if there is more than one, I can combine them, e.g. "ac", "ce".
  # The original dataset also has NA in some columns, so I would need a lot of different conditions with this method (I think...)
  mutate(maxcolname = case_when(
    has_a == has_c & has_a == has_e ~ "equal",
    has_a > has_c & has_a > has_e ~ "a"
  ))
I've added a few comments in the reprex where I'm stuck...
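To make clearer what I'm after in that last step, here is a rough sketch of one possible direction. It goes back to long format and keeps the rows that tie for the per-id maximum, instead of enumerating conditions in case_when. The `maxima` tibble is a hypothetical stand-in for the spread result above (the values are made up, including the NAs), and I'm not sure this is the "right" way to do it.

```r
library(dplyr, warn.conflicts = FALSE)
library(tidyr)

# Hypothetical stand-in for the per-id maxima (made-up values, with some NAs)
maxima <- tibble(
  id    = 1:4,
  has_a = c(3, 2, NA, 1),
  has_c = c(3, 1, 2, 2),
  has_e = c(3, 2, 1, NA)
)

result <- maxima %>%
  # Back to long format: one row per id and letter
  gather(key = "letter", value = "maxval", -id) %>%
  # Keep only the letter itself ("has_a" becomes "a")
  mutate(letter = sub("^has_", "", letter)) %>%
  # Ignore missing values so they cannot win or poison the comparison
  filter(!is.na(maxval)) %>%
  group_by(id) %>%
  # Keep every letter that ties for the maximum within the id
  filter(maxval == max(maxval)) %>%
  # Combine the winning letters into one label, e.g. "ae"
  summarise(maxcolname = paste(letter, collapse = ""))

result
```

With this, a three-way tie comes out as "ace" rather than "equal", and an id whose values are all NA would drop out of the result completely, so both cases would still need separate handling if that matters.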