I'm an admitted beginner with stringr and regular expressions. I've been struggling with this problem. Any help or advice will be appreciated.
My objective is to normalize different sets of Manufacturer Part Numbers for improved matching.
I want to remove all characters except letters and numbers and convert letters to lowercase.
I think I've found a way to get either all the letters or all the numbers, but not both at the same time.
library(stringr)
library(reprex)
#> Warning: package 'reprex' was built under R version 3.4.2
part_num <- c("X17-L", "36-110pc_BL/S", "#008 5")
str_replace_all(part_num, "[^a-zA-Z]", "") %>% str_to_lower()
#> [1] "xl" "pcbls" ""
str_replace_all(part_num, "[^0-9]", "")
#> [1] "17" "36110" "0085"
# I want to combine the two lines above
# Line below does not work because of "OR" condition
# Is there an "AND" condition?
str_replace_all(part_num, "[^a-zA-Z]|[^0-9]", "")
#> [1] "" "" ""
The desired result is:
"x17l" "36110pcbls" "0085"
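One approach that seems to give this result is to put both ranges inside a single negated character class rather than joining two negated classes with `|`. With `[^a-zA-Z]|[^0-9]`, every character matches at least one side (a letter is a non-digit, a digit is a non-letter), so everything is removed. A minimal sketch, assuming the same `part_num` vector as above; the variable name `normalized` is only illustrative:

```r
library(stringr)

part_num <- c("X17-L", "36-110pc_BL/S", "#008 5")

# One negated class: remove anything that is neither a letter nor a digit,
# then lowercase whatever is left.
normalized <- str_to_lower(str_replace_all(part_num, "[^a-zA-Z0-9]", ""))
normalized
#> [1] "x17l" "36110pcbls" "0085"
```

The POSIX class `[^[:alnum:]]` could probably be used instead of `[^a-zA-Z0-9]`, though it may also keep non-ASCII letters and digits, which might or might not be wanted for part numbers.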