Regex: delete everything after the third hyphen

Hello,
I am a beginner with R, and regular expressions are very hard for me.
I have a character vector:
names <- c("TCGA-44-3919-01A-02R-1107-07", "TCGA-44-2668-01A-01R-A278-07", "TCGA-49-AAQV-01A-11R-A39D-07")

and I want to drop everything from the third hyphen onward, so that the output is:

names_output <- c("TCGA-44-3919", "TCGA-44-2668","TCGA-49-AAQV")

I have been trying for 3 hours without success...

Regards

Simon

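You can do this with stringr::str_extract, using a pattern that matches the first three hyphen-separated fields from the start of each string: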

names_input <- c("TCGA-44-3919-01A-02R-1107-07", "TCGA-44-2668-01A-01R-A278-07", "TCGA-49-AAQV-01A-11R-A39D-07")
names_output <- c("TCGA-44-3919", "TCGA-44-2668", "TCGA-49-AAQV")
results <- stringr::str_extract(string = names_input, pattern = "^[^-]+-[^-]+-[^-]+")
all.equal(target = names_output, current = results)
#> [1] TRUE
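
In that pattern, ^ anchors the match at the start of the string and each [^-]+ matches one or more characters that are not a hyphen, so the whole pattern covers three fields joined by two hyphens. If you would rather not depend on stringr, here is a rough base R sketch of the same idea. The names base_results and split_results are just my own illustrative names, and I am assuming every input has at least three fields, as in your example:

```r
names_input <- c("TCGA-44-3919-01A-02R-1107-07",
                 "TCGA-44-2668-01A-01R-A278-07",
                 "TCGA-49-AAQV-01A-11R-A39D-07")

# Option 1: capture the first three fields in a group and replace the
# whole string with just that group.
base_results <- sub(pattern = "^([^-]+-[^-]+-[^-]+).*$",
                    replacement = "\\1",
                    x = names_input)

# Option 2: no regex at all. Split on literal hyphens, keep at most the
# first three pieces of each string, and paste them back together.
split_results <- vapply(
  strsplit(names_input, "-", fixed = TRUE),
  function(parts) paste(head(parts, 3), collapse = "-"),
  character(1)
)

print(base_results)
print(split_results)
```

One difference to be aware of: for a string with fewer than three fields, str_extract returns NA, whereas sub leaves the string unchanged and the split version returns whatever fields are there.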

Hope this helps.

Perfect, thank you

Simon
