Trim strings ending with various characters

I have a vector of strings like the one below:

x <- c("AA.3", "BEAM.2", "BRK.B", "BF.B", "CB.1", "DD.2",  
       "IGT.1", "LSI.1",  "NSM.2",  "PLL.1",  "SUN.1",  "DTV.2", 
       "ALTR.1", "NVLS.1", "AGN.2",  "CPWR.1", "LIFE.3", "FTI.1", 
       "SE.7",  "MMI.3", "ABC", "AS.")

I need to trim these strings. For the strings ending with a dot and number (.1 for example), remove the dot and number. For strings ending with a dot, remove the dot. For the strings ending with a dot and letter (.B for example), do not change them. For other strings, do not change them.

I have been looking at the stri_trim_right function from the stringi package, but could not figure out how to do this with it.

What I would try is to first split each string at the dot and then put the pieces back together according to the rules you describe.
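
For what it's worth, here is a rough sketch of that split-and-rejoin idea in base R. The helper name trim_suffix and the shortened input vector are made up for illustration, and edge cases such as several trailing dots are not handled, so treat it as a starting point rather than a finished solution.

```r
# Shortened example input (a subset of the vector from the question)
x <- c("AA.3", "BRK.B", "ABC", "AS.")

# Hypothetical helper: split on the literal dot, drop a trailing
# all-digit piece, then glue the remaining pieces back together.
trim_suffix <- function(s) {
  pieces <- strsplit(s, ".", fixed = TRUE)[[1]]
  last <- pieces[length(pieces)]
  if (length(pieces) > 1 && grepl("^[0-9]+$", last)) {
    pieces <- pieces[-length(pieces)]
  }
  # strsplit() already discards the empty piece after a trailing dot,
  # so "AS." comes back as just "AS"
  paste(pieces, collapse = ".")
}

result <- vapply(x, trim_suffix, character(1), USE.NAMES = FALSE)
print(result)
```

This loops over the strings one at a time, so on large data a single vectorised regex call will likely be faster.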

This is a job for regular expressions:

library(stringr)

x <- c("AA.3", "BEAM.2", "BRK.B", "BF.B", "CB.1", "DD.2",  
       "IGT.1", "LSI.1",  "NSM.2",  "PLL.1",  "SUN.1",  "DTV.2", 
       "ALTR.1", "NVLS.1", "AGN.2",  "CPWR.1", "LIFE.3", "FTI.1", 
       "SE.7",  "MMI.3", "ABC", "AS.")

str_remove(x, "\\.\\d?$")
#>  [1] "AA"    "BEAM"  "BRK.B" "BF.B"  "CB"    "DD"    "IGT"   "LSI"   "NSM"  
#> [10] "PLL"   "SUN"   "DTV"   "ALTR"  "NVLS"  "AGN"   "CPWR"  "LIFE"  "FTI"  
#> [19] "SE"    "MMI"   "ABC"   "AS"

Created on 2021-01-01 by the reprex package (v0.3.0.9001)

Thanks so much. My data is large and I prefer to use the package stringi, which is much faster than stringr. Do you know how to do it in stringi? Thanks.

stringr uses stringi as its backend, so it shouldn't make much of a difference, but you can use this function:

library(stringi)

x <- c("AA.3", "BEAM.2", "BRK.B", "BF.B", "CB.1", "DD.2",  
       "IGT.1", "LSI.1",  "NSM.2",  "PLL.1",  "SUN.1",  "DTV.2", 
       "ALTR.1", "NVLS.1", "AGN.2",  "CPWR.1", "LIFE.3", "FTI.1", 
       "SE.7",  "MMI.3", "ABC", "AS.")

stri_replace(x, "", regex = "\\.\\d?$")
#>  [1] "AA"    "BEAM"  "BRK.B" "BF.B"  "CB"    "DD"    "IGT"   "LSI"   "NSM"  
#> [10] "PLL"   "SUN"   "DTV"   "ALTR"  "NVLS"  "AGN"   "CPWR"  "LIFE"  "FTI"  
#> [19] "SE"    "MMI"   "ABC"   "AS"
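
If you want to convince yourself that the different routes agree, a quick check along these lines may help. It assumes stringi is installed and uses a shortened input; the variable names are only for illustration. stri_replace() with a regex argument and the default mode dispatches to stri_replace_first_regex(), and base R's sub() is included as a dependency-free comparison.

```r
library(stringi)

# Shortened example input (a subset of the vector from the question)
x <- c("AA.3", "BRK.B", "ABC", "AS.")
pattern <- "\\.\\d?$"

via_generic  <- stri_replace(x, "", regex = pattern)        # generic wrapper
via_explicit <- stri_replace_first_regex(x, pattern, "")    # explicit function
via_base     <- sub(pattern, "", x)                         # base R, no packages

# All three should return the same character vector for this input
stopifnot(identical(via_generic, via_explicit))
stopifnot(identical(via_explicit, via_base))
print(via_explicit)
```

Whether one of them is noticeably faster on your data is something you would need to time yourself.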

Thanks so much! Are there any learning materials that can help me understand what these regexes mean?

A general overview

And a more complete resource
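
In the meantime, the pattern used above can be read piece by piece: \\. is a literal dot (escaped, because a bare dot matches any character), \\d? is an optional single digit, and $ anchors the match to the end of the string. That is why ".B" is left alone while ".3" and a bare trailing "." are removed. The sketch below uses base R's sub() so it runs without extra packages. The "XYZ.12" entry is a made-up example, not from the question, added to show that \\d? only covers one digit; if multi-digit suffixes can occur in your data, \\d* may be what you want.

```r
# Shortened input plus one hypothetical multi-digit case ("XYZ.12")
y <- c("AA.3", "BRK.B", "ABC", "AS.", "XYZ.12")

pattern_one_digit  <- "\\.\\d?$"  # dot, at most one digit, end of string
pattern_any_digits <- "\\.\\d*$"  # dot, zero or more digits, end of string

one_digit  <- sub(pattern_one_digit, "", y)
any_digits <- sub(pattern_any_digits, "", y)

print(one_digit)   # "XYZ.12" is left unchanged by this pattern
print(any_digits)  # "XYZ.12" becomes "XYZ" with this pattern
```

The same patterns should work unchanged in str_remove() and stri_replace() for plain ASCII input like this.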

Thanks so much! I will read the first link.
