str_remove() regex pattern

Hi,

I've been trying to figure this out for a while.

I'd like to apply

library(stringr)
str_extract()

and get everything after "_". How do you say 'everything' and 'after' in regex?

Many thanks,

Jakub

Hi,

Here are two ways to do that:

library(stringr)


str_extract("before_after", "(?<=_).*")
#> [1] "after"
str_remove("before_after", "^.*_")
#> [1] "after"

Created on 2022-04-05 by the reprex package (v2.0.1)

Hope this helps,
PJ


Two possible versions of "everything" are .+ and .*. The . means "any character" (by default, any character except a line break), the + means "one or more" and the * means "zero or more". So, .+ means "one or more of any character" and .* means "zero or more of any character".
One way to say "after an _" is to say "look back for an _". This is done with a lookbehind, "(?<=_)". You can say "look back for an _ and match one or more of any character until the end of the input" with (?<=_).+$. The $ means "the end of the input".

str_extract(c("ert_67fgt","cvcfd_kio"),"(?<=_).+$")
[1] "67fgt" "kio"
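
The difference between .+ and .* only shows up when there is nothing after the _. Here is a small sketch of that case; the input strings are made up for illustration.

```r
library(stringr)

# Hypothetical inputs: the second string has nothing after the underscore.
x <- c("before_after", "before_")

# .* may match zero characters, so the empty tail comes back as "".
star <- str_extract(x, "(?<=_).*")

# .+ needs at least one character, so the second string gives NA.
plus <- str_extract(x, "(?<=_).+")
```

So pick .* if an empty result is acceptable and .+ if you would rather get NA when nothing follows the _.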

Another way to match from an _ to the end of the input is to say "match one or more of any character that is not an _ until the end of the input". The expression for "not an _" is [^_], so the full expression is "[^_]+$"

str_extract(c("ert_67fgt","cvcfd_kio"),"[^_]+$")
[1] "67fgt" "kio"
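
The patterns in this thread agree when there is exactly one _, but they can differ when there are several. A minimal sketch, assuming a made-up input with two underscores (the lazy .*? variant is an addition that does not appear in the replies above):

```r
library(stringr)

x <- "a_b_c"  # hypothetical input with two underscores

# Lookbehind: the match starts right after the first _.
after_first <- str_extract(x, "(?<=_).+$")

# Negated class: only the piece after the last _ can reach the end.
after_last <- str_extract(x, "[^_]+$")

# Greedy .* removes everything up to and including the last _.
removed <- str_remove(x, "^.*_")

# Lazy .*? stops at the first _, leaving the rest untouched.
removed_lazy <- str_remove(x, "^.*?_")
```

Here after_first and removed_lazy keep "b_c", while after_last and removed keep only "c", so the right pattern depends on which _ you mean by "after".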
