Regular expressions containing special characters

Rsky · May 26, 2021, 6:54am

chara <- ".a.b.c (good) .e.f.g (bad) "
str_replace_all(chara,".","_")
str_extract(chara,"(.*)")

I want to extract the characters inside "()" and the characters after each "." and put each of them in its own column: col_1, col_2, ..., col_8.

Like this:

tibble(col_1 = "a",col_2="b",....col_8="bad")

When I looked into it, I found that . and () have special meanings in a pattern, because they are regex metacharacters used to match other characters.

Is there any way to replace them, like janitor::clean_names() does?
Or how can I specify them literally in the pattern?

Thank you.

Yarnabrina · May 26, 2021, 7:25am

You can do something like this, and modify for your use:

> chara <- ".a.b.c (good) .e.f.g (bad) "
> stringr::str_extract_all(chara, r"((?<=\.)[a-z]|(?<=\()[a-z]+?(?=\)))")
[[1]]
[1] "a"    "b"    "c"    "good" "e"    "f"    "g"    "bad" 

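To make the pattern above easier to modify, here is a rough sketch that builds the same regex from two named pieces. It uses ordinary strings with doubled backslashes instead of the raw-string syntax, which needs R 4.0 or later. The names after_dot, inside_paren and result are only illustrative.

```r
library(stringr)

chara <- ".a.b.c (good) .e.f.g (bad) "

# One lowercase letter that is preceded by a literal "." (lookbehind).
after_dot <- "(?<=\\.)[a-z]"

# Lowercase letters preceded by a literal "(" and followed by a literal ")".
inside_paren <- "(?<=\\()[a-z]+?(?=\\))"

# Join the two alternatives with "|", as in the raw-string version.
pattern <- paste(after_dot, inside_paren, sep = "|")

result <- str_extract_all(chara, pattern)[[1]]
print(result)
```

The lookarounds only check the neighbouring characters, so the dots and parentheses themselves are not part of the matches.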

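Regarding your question about janitor::clean_names():

I haven't used that function myself, but you can escape special characters in a pattern with a backslash (\). The backslash is itself a special character in R strings, so it has to be escaped too: write "\\." for a literal dot, or use a raw string such as r"(\.)" as in the pattern above.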
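As a minimal sketch of what escaping changes, here are the two calls from the question next to escaped versions. The variable names are made up for illustration, and stringr::fixed() is shown as one alternative to escaping, not as the only way to do it.

```r
library(stringr)

chara <- ".a.b.c (good) .e.f.g (bad) "

# Unescaped "." matches any character, so every character becomes "_".
all_replaced <- str_replace_all(chara, ".", "_")

# "\\." in an R string is the regex \. and matches only a literal dot.
dots_escaped <- str_replace_all(chara, "\\.", "_")

# fixed() treats the pattern as plain text, so no escaping is needed.
dots_fixed <- str_replace_all(chara, fixed("."), "_")

# Unescaped "(.*)" is a capture group around "anything", not literal parentheses.
whole_string <- str_extract(chara, "(.*)")

# Escaped parentheses match the literal "(" and ")" characters.
in_parens <- str_extract_all(chara, "\\(.*?\\)")[[1]]

print(dots_escaped)
print(in_parens)
```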

Hope this helps.

technocrat · May 26, 2021, 7:43am

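alpha <- "[:alpha:]+"   # one or more letters; dots, spaces and parentheses are skipped
chara <- ".a.b.c (good) .e.f.g (bad) "
# simplify = TRUE returns a character matrix with one row per input string
dat <- data.frame(stringr::str_extract_all(chara, alpha, simplify = TRUE))
# paste0() avoids the space that paste() would insert into the names
colnames(dat) <- paste0("col_", seq_along(dat))
dat
#>   col_1 col_2 col_3 col_4 col_5 col_6 col_7 col_8
#> 1     a     b     c  good     e     f     g   bad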

Updated to name the third argument (simplify = TRUE) instead of passing it by position. H/T @Yarnabrina
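Since the question asked for a tibble, here is a sketch of the same idea using tibble::as_tibble_row(), assuming tibble 3.0.0 or later is installed. The names words and out are illustrative, and the POSIX-style class is written as [[:alpha:]] here.

```r
library(stringr)
library(tibble)

chara <- ".a.b.c (good) .e.f.g (bad) "

# Extract every run of letters from the single input string.
words <- str_extract_all(chara, "[[:alpha:]]+")[[1]]

# Name the elements col_1, col_2, ... so they become the column names.
names(words) <- paste0("col_", seq_along(words))

# A named vector becomes a one-row tibble, one column per element.
out <- as_tibble_row(words)
print(out)
```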

mara · May 26, 2021, 12:44pm

FWIW, there's a vignette on regular expressions in stringr that you might want to check out. In fact, it even covers one of your precise cases!

If “.” matches any character, how do you match a literal “.”? You need to use an “escape” to tell the regular expression you want to match it exactly, not use its special behaviour. Like strings, regexps use the backslash, \, to escape special behaviour. So to match an ., you need the regexp \.. Unfortunately this creates a problem. We use strings to represent regular expressions, and \ is also used as an escape symbol in strings. So to create the regular expression \. we need the string "\\.".
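The two levels of escaping described in that passage can be checked directly. This is just a small illustration; the variable names are invented for it.

```r
library(stringr)

dot_regex <- "\\."

# writeLines() shows the characters the string actually contains: \.
writeLines(dot_regex)

# The string is two characters long: a backslash and a dot.
n_chars <- nchar(dot_regex)

# As a regex it matches only strings that contain a literal dot.
escaped_hits <- str_detect(c("a.b", "ab"), dot_regex)

# The unescaped "." matches any character, so both strings match.
unescaped_hits <- str_detect(c("a.b", "ab"), ".")
```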
