Hello, I am working on a project that requires string matching, and I am having trouble with one specific drug. I need to match "statin" to atorvastatin, but when I use the grepl function, other drugs such as Bio-statin, sandostatin, and somatostatin are also picked up when they shouldn't be. Are there any suggestions on how to fix this?
Hi @mcnealmm,
Welcome to the RStudio Community Forum.
It's not clear from your question whether you want to replace particular matched strings or obtain their positions in a character vector. Hopefully the following code shows how to do both. If you need further help, please post a reproducible example (see the posting guide at the top of this forum).
drug <- c("atorvastatin", "Bio-statin", "sandostatin", "somatostatin", "atorvastatin")
gsub("atorvastatin", "statin", drug) # Replaces the matched string with another
#> [1] "statin" "Bio-statin" "sandostatin" "somatostatin" "statin"
grepl("statin", drug) # All elements of 'drug' contain the substring 'statin'
#> [1] TRUE TRUE TRUE TRUE TRUE
grepl("atorvastatin", drug) # Only two contain the string 'atorvastatin'
#> [1] TRUE FALSE FALSE FALSE TRUE
drug[grepl("atorvastatin", drug)] # Extracts the elements that match
#> [1] "atorvastatin" "atorvastatin"
which(grepl("atorvastatin", drug)) # Positions of the matching elements
#> [1] 1 5
Created on 2020-07-03 by the reprex package (v0.3.0)
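If the goal is to flag genuine statins while skipping names that merely contain "statin", one option is to match against an explicit list of drug names instead of the bare substring. Here is a minimal sketch of that idea. The `statins` list and the extra entries in `drug` are illustrative assumptions, not taken from your data, so adjust them to suit.

```r
# Example data; the last two entries are made up for illustration
drug <- c("atorvastatin", "Bio-statin", "sandostatin", "somatostatin",
          "Atorvastatin 20 mg", "simvastatin")

# Hypothetical whitelist of statin names; extend it to cover your data
statins <- c("atorvastatin", "simvastatin", "rosuvastatin", "pravastatin")

# Build one alternation pattern; \\b is a word boundary, so each name
# must appear as a whole word rather than as part of a longer string
pattern <- paste0("\\b(", paste(statins, collapse = "|"), ")\\b")

# perl = TRUE selects PCRE; ignore.case = TRUE allows "Atorvastatin"
is_statin <- grepl(pattern, drug, ignore.case = TRUE, perl = TRUE)

drug[is_statin]
#> [1] "atorvastatin"       "Atorvastatin 20 mg" "simvastatin"
```

If every entry is exactly a drug name with nothing else attached, `tolower(drug) %in% statins` would do the same job without a regular expression.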
HTH