How can I grepl a word that includes a letter?

My question is clear, I guess. I want to get the words that include X, J, Q or W. How can I do that? I found some solutions, but they only cover the case where the word starts with one of these letters. I want every row whose value includes one of these letters to be moved to another data.table. How can I do this? I've tried

grepl("^X.*?") and it didn't work for words that start with X. Any suggestions? Thank you.

If you want to check whether an entry in your character vector contains any of those letters anywhere in the string, you can use the regex pattern "[XJQW]". A pattern in square brackets is a character class: it matches any one of the listed characters, at any position in the string. Note that the match is case-sensitive by default.

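set.seed(040915)
# 100 random strings of 5 characters each (requires the stringi package)
testers <- stringi::stri_rand_strings(100,5)
# TRUE for every string that contains X, J, Q or W anywhere
found_it <- grepl("[XJQW]",testers)
testers[found_it]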
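To make the difference from the anchored pattern in the question concrete, here is a small sketch in base R. The vector words is made up for illustration and is not from the original post. It compares a pattern anchored with ^ against the character class:

```r
# Hypothetical example words, not taken from the original post
words <- c("Xylophone", "Taxi", "taXi", "Jazz", "Hello")

# Anchored pattern: TRUE only when the string starts with an uppercase "X"
starts_with_x <- grepl("^X", words)

# Character class: TRUE when X, J, Q or W appears anywhere (case-sensitive)
contains_any <- grepl("[XJQW]", words)

print(data.frame(words, starts_with_x, contains_any))
```

Only "Xylophone" satisfies the anchored pattern, while the character class also picks up "taXi" and "Jazz". "Taxi" is missed by both because its x is lowercase.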

Here is an example of what I want to achieve.

library(data.table)
DT1 <- data.table(ZipCodes = c("20202","10101","00023","40404","00001","25002"),
                  Cities =c("Xi'an/China","Washington D.C","Johannesburg","NewYork/US","Quebeck","Houston"))
#> ZipCodes      Cities
#> 20202           Xi'an/China
#> 10101           Washington D.C
#> 00023           Johannesburg
#> 40404           NewYork/US
#> 00001           Quebeck
#> 25002           Houston

What I want is to use grepl to find the entries in DT1$Cities that contain X, W, J or Q, and copy each matching row in full to DT2. I couldn't manage it with grepl("[XWJQ]",DT1$cities) or grepl("^X.*?",DT1$Cities)

In the end, DT2 should look like this.

DT2 <- data.table(ZipCodes = c("20202","10101","00023","40404","00001"),
                  Cities =c("Xi'an/China","Washington D.C","Johannesburg","NewYork/US","Quebeck"))

#> ZipCodes      Cities
#> 20202           Xi'an/China
#> 10101           Washington D.C
#> 00023           Johannesburg
#> 40404           NewYork/US
#> 00001           Quebeck

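In the piece of code you included, the column name is written in lowercase as DT1$cities, but the column is called Cities and R names are case-sensitive. Also, based on your target DT2 (which keeps NewYork/US), you want lowercase letters to be matched as well. Try this: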

library(data.table)

DT1 <- data.table(ZipCodes = c("20202","10101","00023","40404","00001","25002"),
                  Cities =c("Xi'an/China","Washington D.C","Johannesburg","NewYork/US","Quebeck","Houston"))

# keep every row whose city contains X, J, W or Q in either case
DT2 <- DT1[grepl("[XJWQxwjq]",DT1$Cities),]
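As a possible alternative to listing both cases in the pattern, grepl also accepts ignore.case = TRUE. The sketch below is only an illustration: it stores the same data in a base data.frame (an assumption made here so that it runs without extra packages), and the names DF1, DF2, has_letter and rest are hypothetical. The same logical index should work for subsetting a data.table too.

```r
# Same data as in the question, stored in a base data.frame
# (an assumption, so that the sketch needs no extra packages)
DF1 <- data.frame(
  ZipCodes = c("20202", "10101", "00023", "40404", "00001", "25002"),
  Cities   = c("Xi'an/China", "Washington D.C", "Johannesburg",
               "NewYork/US", "Quebeck", "Houston"),
  stringsAsFactors = FALSE
)

# Column names are case-sensitive: "cities" does not refer to "Cities"
is.null(DF1$cities)

# ignore.case = TRUE lets "[XJWQ]" match lowercase letters as well
has_letter <- grepl("[XJWQ]", DF1$Cities, ignore.case = TRUE)

DF2  <- DF1[has_letter, ]    # rows containing X, J, W or Q in any case
rest <- DF1[!has_letter, ]   # rows that were not matched
```

With this data, only the Houston row ends up in rest, because "NewYork/US" is matched through its lowercase w.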

Thank you. Totally my bad: I misplaced the '|' operator last night, I was super tired.