Regular expression - help

Hi, I have the following very long string:

"@name37:51BLA BLA@othername38:00BLA@otherothername" and so on.

Could anyone suggest how to use regular expressions to clean this string?

I would like to keep each @name, its time, and its text separate.



I would really appreciate your help.

The regex below uses two positive lookaheads and a positive lookbehind to split the string at the desired locations without consuming any characters. The map at the end breaks the result up into a list containing each @name and its associated strings.


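library(tidyverse)

x = "@name37:51BLA BLA@othername38:00BLA@otherothername"

# Split at zero-width positions: before each "@", and before and after each mm:ss time
xs = str_split(x, "(?=@)|(?=\\d{2}:\\d{2})|(?<=\\d{2}:\\d{2})", simplify=TRUE)
xs = xs[xs != ""]  # drop any empty piece, in case the split point at the very start yields one

# Group the pieces in threes: name, time, text
map(seq(1,length(xs), 3), ~xs[.x:(.x+2)])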
#> [[1]]
#> [1] "@name"   "37:51"   "BLA BLA"
#> [[2]]
#> [1] "@othername" "38:00"      "BLA"       
#> [[3]]
#> [1] "@otherothername" NA                NA

Created on 2020-10-25 by the reprex package (v0.3.0)
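
To see why the lookarounds matter, here is a small sketch on a toy string (the input `s` and the names `consumed` and `kept` are made up for illustration). It contrasts an ordinary pattern, which consumes the time when splitting, with the lookahead/lookbehind pair, which only marks the positions around it.

```r
library(stringr)

s <- "ab12:34cd"  # hypothetical toy input, not from the original question

# An ordinary pattern consumes what it matches, so the time itself is dropped.
consumed <- str_split(s, "\\d{2}:\\d{2}")[[1]]

# Lookarounds match a position rather than characters,
# so the time is kept as a piece of its own.
kept <- str_split(s, "(?=\\d{2}:\\d{2})|(?<=\\d{2}:\\d{2})")[[1]]

print(consumed)
print(kept)
```

The lookbehind uses a fixed-length pattern here, which keeps it within what the regex engine behind stringr accepts for lookbehinds.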

If you want a data frame, you can do:

map_df(seq(1,length(xs), 3), ~xs[.x:(.x+2)] %>% 
         set_names("name", "time", "text"))
#>   name            time  text   
#>   <chr>           <chr> <chr>  
#> 1 @name           37:51 BLA BLA
#> 2 @othername      38:00 BLA    
#> 3 @otherothername NA    NA     
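
As an alternative to splitting and regrouping, the three fields could also be pulled out directly with capture groups. This is only a sketch, not the approach used above: it assumes that names contain no digits and no "@", and the names `pattern`, `m` and `parts` are made up for illustration.

```r
library(stringr)

x <- "@name37:51BLA BLA@othername38:00BLA@otherothername"

# Group 1: "@" plus a name without digits or "@" (an assumption about the names).
# Groups 2 and 3: an optional mm:ss time and the text up to the next "@".
pattern <- "(@[^@\\d]+)(?:(\\d{2}:\\d{2})([^@]*))?"

# One row per match; column 1 is the full match, columns 2 to 4 are the groups.
m <- str_match_all(x, pattern)[[1]]

parts <- data.frame(name = m[, 2], time = m[, 3], text = m[, 4],
                    stringsAsFactors = FALSE)
print(parts)
```

A trailing name with no time or text, like the last one in `x`, gets NA in the `time` and `text` columns because the optional group does not take part in the match.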

Thanks!! This was exactly what I was looking for.


Just a very quick follow-up on this question. I now have another set of strings of the form:

x1 = "@name1:4:51BLE BLE@othername1:18:00BLE@otherothername"

Would there be any way to modify the code you proposed so that it works on these cases? I'm not looking for general code that handles both x and x1, just an adaptation of your xs for strings like x1.

@joels, your help in the last comment has been very useful.

I solved it using the expression


