grep excluded pattern

How can I write the pattern passed to grep() so that "CTT_OMT" is excluded?

txt<-c("OAT","HST","OCT","OMT","CTT","CTT_OMT")
grep("[OACM]T$",txt,value = T)
[1] "OAT"     "OCT"     "OMT"     "CTT_OMT"

> grep("[OACMT{2}]",txt,value = T)
[1] "OAT"     "HST"     "OCT"     "OMT"     "CTT"     "CTT_OMT"
> grep("[OACMT{1,2}]",txt,value = T)
[1] "OAT"     "HST"     "OCT"     "OMT"     "CTT"     "CTT_OMT"
> grep("[OACMT^_]",txt,value = T)
[1] "OAT"     "HST"     "OCT"     "OMT"     "CTT"     "CTT_OMT"
> grep("[OACMT^.]",txt,value = T)
[1] "OAT"     "HST"     "OCT"     "OMT"     "CTT"     "CTT_OMT"

The desired result is:

 "OAT"     "OCT"     "OMT"     "CTT"

Thanks

Perhaps this is cheating.

txt<-c("OAT","HST","OCT","OMT","CTT","CTT_OMT")

grep(pattern = "^[OC].T$", x = txt, value = TRUE)
#> [1] "OAT" "OCT" "OMT" "CTT"

Created on 2020-05-10 by the reprex package (v0.3.0)
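A short sketch of why the earlier attempts matched too much and why this pattern does not. Inside square brackets, characters such as `{`, `2`, `}`, `.` and a non-leading `^` are literal members of the set, so a pattern like `[OACMT{2}]` only asks for "contains one of these characters". Separately, a pattern without `^` may start matching anywhere in the string. The variable names `unanchored` and `anchored` are only for illustration.

```r
txt <- c("OAT", "HST", "OCT", "OMT", "CTT", "CTT_OMT")

# Every element contains a "T", so a bracket set that includes T matches all of them.
all(grepl("[OACMT{2}]", txt))
#> [1] TRUE

# No "^": the trailing "MT" of "CTT_OMT" is enough for a match.
unanchored <- grep("[OACM]T$", txt, value = TRUE)
unanchored
#> [1] "OAT"     "OCT"     "OMT"     "CTT_OMT"

# "^" and "$" together force the whole string to be exactly three characters.
anchored <- grep("^[OC].T$", txt, value = TRUE)
anchored
#> [1] "OAT" "OCT" "OMT" "CTT"
```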


Why would this be cheating?

I need to adjust the pattern for a new requirement: the vector txt now also contains "OOT", and I need to exclude that too.

txt<-c("OAT","HST","OCT","OMT","CTT","CTT_OMT","OOT")

The output should be like the previous one, but it should also include "HST":

> [1] "OAT" "HST" "OCT" "OMT" "CTT"

But THIS is cheating: changing the rules after starting the game.

Two thoughts:

  • FJCC's solution should give you some ideas
  • it may be wise to describe the real problem: the two cases you mention appear to be just symptoms.

My attempts so far, none of which does it in a single pattern.
How can I combine the patterns of the two separate grep calls into one and get the same result?

 txt<-c("OAT","HST","OCT","OMT","CTT","CTT_OMT","OOT")
> grep(pattern = "^[OHC].T$", x = txt, value = TRUE)
[1] "OAT" "HST" "OCT" "OMT" "CTT" "OOT"
> #1
> txt2<-grep(pattern = "^[OHC].T$", x = txt, value = TRUE)
> grep(pattern = '[^"OOT"]', x = txt2, value = TRUE)
[1] "OAT" "HST" "OCT" "OMT" "CTT"
> #I would use two grep calls:
> #2
> grep("OOT", grep('^[OHC].T$', txt, value = T),value = T,invert=TRUE)
[1] "OAT" "HST" "OCT" "OMT" "CTT"
> #3
> grep('[^"OOT"]', grep('^[OHC].T$', txt, value = T),value = T)
[1] "OAT" "HST" "OCT" "OMT" "CTT"
> #4
> intersect(grep('^[OHC].T$',txt,value = T),grep("OOT",txt,invert=TRUE,value = T))
[1] "OAT" "HST" "OCT" "OMT" "CTT"
> ###without results !!!
> grep('[^\\bOOT]', txt,value = T)
[1] "OAT"     "HST"     "OCT"     "OMT"     "CTT"     "CTT_OMT"
> grep('([^"OOT"].$)', txt,value = T)
[1] "OAT"     "HST"     "OCT"     "OMT"     "CTT_OMT"
> 

> grep('^([?CM]|[^\\b_])*$', txt,value = T)
[1] "OAT" "HST" "OCT" "OMT" "CTT" "OOT"
> 
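A note on the attempts above, offered as a sketch rather than the only way to do it: `[^"OOT"]` is a negated bracket set (one character that is not a double quote, `O` or `T`), not "anything except the word OOT", which is why it only appears to work on this particular vector. One way to put both conditions in a single pattern is a negative lookahead, which needs `perl = TRUE`. The name `single` is only for illustration.

```r
txt <- c("OAT", "HST", "OCT", "OMT", "CTT", "CTT_OMT", "OOT")

# The negated set fails only when every character is a double quote, O or T.
grepl('[^"OOT"]', c("OOT", "TOTO", "OAT"))
#> [1] FALSE FALSE  TRUE

# (?!OOT$) rejects the exact string "OOT"; the rest requires a three-character
# code that starts with O, H or C and ends in T.
single <- grep("^(?!OOT$)[OHC].T$", txt, value = TRUE, perl = TRUE)
single
#> [1] "OAT" "HST" "OCT" "OMT" "CTT"
```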


Maybe use grepl to get TRUE/FALSE indicators of which elements match each simple regex, then combine the grepl results with & to find the elements that satisfy all of them.

txt<-c("OAT","HST","OCT","OMT","CTT","CTT_OMT")

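# TRUE for every element that contains "C" (no backslash needed for a plain letter)
g1 <- grepl("C", txt)
# TRUE for every element that contains "M"
g2 <- grepl("M", txt)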


g1 & g2

txt[g1&g2]
> g1<- grepl("\\OO+",txt)
> g2<- grepl("\\_+",txt)
> g1 | g2
[1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
> txt[!(g1|g2)]
[1] "OAT" "HST" "OCT" "OMT" "CTT" #the desired result
> 

How can I combine "\\OO+" and "\\_+" into a single pattern to get this result?

Are you asking how to write a single regex?

g3 <- grepl("OO+|_+", txt)
txt[!g3]
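The same idea can be sketched with grep itself: `|` joins the two alternatives and `invert = TRUE` returns the elements that do not match. This assumes the seven-element vector from the earlier post; the name `kept` is only for illustration.

```r
txt <- c("OAT", "HST", "OCT", "OMT", "CTT", "CTT_OMT", "OOT")

# Drop every element that contains "OO" or an underscore.
kept <- grep("OO|_", txt, value = TRUE, invert = TRUE)
kept
#> [1] "OAT" "HST" "OCT" "OMT" "CTT"
```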

This works but again seems very specific to the list presented.

txt<-c("OAT","HST","OCT","OMT","CTT","CTT_OMT","OOT")
grep(pattern = "^[OHC][^O]T$", x = txt, value = TRUE)
#> [1] "OAT" "HST" "OCT" "OMT" "CTT"

Created on 2020-05-11 by the reprex package (v0.3.0)
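If the real rule turns out to be something like "three-letter upper-case codes ending in T, minus a few known exceptions" (an assumption, since the thread never states the underlying requirement), a less list-specific sketch is to keep the shape test and the exclusions separate. The vector `exclude` and the name `keep` are hypothetical.

```r
txt <- c("OAT", "HST", "OCT", "OMT", "CTT", "CTT_OMT", "OOT")

# Hypothetical list of codes to drop; extend as needed.
exclude <- c("OOT")

# Shape test: exactly two upper-case letters followed by "T".
keep <- grepl("^[[:upper:]]{2}T$", txt) & !(txt %in% exclude)
txt[keep]
#> [1] "OAT" "HST" "OCT" "OMT" "CTT"
```

Keeping the exclusions in a plain vector means new exceptions do not require editing the regex.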
