Hi Chris, since the replacements for stage 1 are complex, I suggest you first build a lookup dictionary, which is just a data.frame that maps each raw value to its simplified stage, and then use a join to attach the simplified stages. The lookup dictionary looks like this:
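library(tidyverse)  # attaches tibble (enframe), tidyr (unnest), dplyr (rename, %>%)
# Named list: each name is a simplified stage, each vector holds the raw codes mapped to it
stage_look_up <- list(
  `1` = c("1MI", "c1MI", "1", "1A", "1A1", "1A2", "1B1"),
  `2` = c("2MI", "c2MI", "2", "2A", "2A1", "2A2", "2B1")
) %>%
  enframe() %>%       # two columns: name (stage) and value (list of raw codes)
  unnest(value) %>%   # one row per raw code
  rename(TNM_CLIN_T = value, stage.simplify = name)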
> stage_look_up
# A tibble: 14 x 2
   stage.simplify TNM_CLIN_T
   <chr>          <chr>
 1 1              1MI
 2 1              c1MI
 3 1              1
 4 1              1A
 5 1              1A1
 6 1              1A2
 7 1              1B1
 8 2              2MI
 9 2              c2MI
10 2              2
11 2              2A
12 2              2A1
13 2              2A2
14 2              2B1
Then join the dictionary to your data to do the lookup:
df %>% left_join(stage_look_up,by = 'TNM_CLIN_T')
# A tibble: 8 x 2
  TNM_CLIN_T stage.simplify
  <chr>      <chr>
1 blank      NA
2 cX         NA
3 blank      NA
4 c4         NA
5 c3         NA
6 c2         NA
7 1A2        1
8 c1         NA
I didn't define simplified stages for "c1", "c2", and so on, so those rows come back as NA in the result. You will need to add the remaining codes to the dictionary yourself.
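One way to finish the dictionary is to list the codes that still have no match and then append rows for them. The sketch below is only an illustration: `df` is rebuilt from the eight values shown in the join output above, the names `unmatched`, `extra`, and `stage_look_up_full` are made up for this example, and the mappings "c1" to 1 and "c2" to 2 are my guesses, so check them against your staging rules before using them.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Example data, taken from the TNM_CLIN_T values shown in the join output
df <- tibble(TNM_CLIN_T = c("blank", "cX", "blank", "c4", "c3", "c2", "1A2", "c1"))

# Same dictionary as in the answer
stage_look_up <- list(
  `1` = c("1MI", "c1MI", "1", "1A", "1A1", "1A2", "1B1"),
  `2` = c("2MI", "c2MI", "2", "2A", "2A1", "2A2", "2B1")
) %>%
  enframe(name = "stage.simplify", value = "TNM_CLIN_T") %>%
  unnest(TNM_CLIN_T)

# Codes that appear in df but are missing from the dictionary
unmatched <- df %>%
  anti_join(stage_look_up, by = "TNM_CLIN_T") %>%
  distinct(TNM_CLIN_T)

# ASSUMED extra mappings (not from the original answer); confirm before use
extra <- tribble(
  ~stage.simplify, ~TNM_CLIN_T,
  "1",             "c1",
  "2",             "c2"
)

stage_look_up_full <- bind_rows(stage_look_up, extra)

# Redo the lookup with the extended dictionary
result <- df %>% left_join(stage_look_up_full, by = "TNM_CLIN_T")
```

Anything still listed in `unmatched` but not added to `extra` (here "blank", "cX", "c3", "c4") keeps NA after the join, which makes it easy to see what is left to decide.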