Extract content inside last parentheses in a string

Hi!

I'm a newbie when it comes to regular expressions and I'm struggling with a problem. I want to extract the content inside the last pair of parentheses in a string. In my data this pair is always at the end of the string, but I assume it should be possible to extract it regardless of position. I found an example that works fine when there is only one pair of parentheses (see the first observation below), but I need a solution that also works when the string contains more than one pair.

Any help would be greatly appreciated.

Best, Richard

library(tidyverse)
#> Warning: package 'tibble' was built under R version 4.0.4
#> Warning: package 'tidyr' was built under R version 4.0.4
#> Warning: package 'dplyr' was built under R version 4.0.4
#> Warning: package 'forcats' was built under R version 4.0.4

df <- tribble(
  ~group,                       ~textstring,
     "A",       "This is one (example)",
     "B", "This (is) another (example)",
     "C",   "This is (a 3rd) (example)"
  )

df %>% 
  mutate(newvar = str_match(textstring, "(?<=\\().*(?=\\))"))
#> # A tibble: 3 x 3
#>   group textstring                  newvar[,1]          
#>   <chr> <chr>                       <chr>               
#> 1 A     This is one (example)       example             
#> 2 B     This (is) another (example) is) another (example
#> 3 C     This is (a 3rd) (example)   a 3rd) (example

Created on 2021-03-26 by the reprex package (v1.0.0)

Sorry, I meant to write str_extract rather than str_match. The question is the same: how do I extract only the content inside the final pair of parentheses in the string?

library(tidyverse)
#> Warning: package 'tibble' was built under R version 4.0.4
#> Warning: package 'tidyr' was built under R version 4.0.4
#> Warning: package 'dplyr' was built under R version 4.0.4
#> Warning: package 'forcats' was built under R version 4.0.4

df <- tribble(
  ~group,                       ~textstring,
  "A",       "This is one (example)",
  "B", "This (is) another (example)",
  "C",   "This is (a 3rd) (example)"
)

df %>% 
  mutate(newvar = str_extract(textstring, "(?<=\\().*(?=\\))"))
#> # A tibble: 3 x 3
#>   group textstring                  newvar              
#>   <chr> <chr>                       <chr>               
#> 1 A     This is one (example)       example             
#> 2 B     This (is) another (example) is) another (example
#> 3 C     This is (a 3rd) (example)   a 3rd) (example

Created on 2021-03-26 by the reprex package (v1.0.0)

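Does this help? The pattern only allows non-parenthesis characters inside the match, and the lookahead requires that no further parentheses follow the closing one, so only the last pair can match.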

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(stringr)

df <- tribble(
    ~group,                       ~textstring,
    "A",       "This is one (example)",
    "B", "This (is) another (example)",
    "C",   "This is (a 3rd) (example)"
)

df %>% 
    mutate(newvar = str_extract(textstring, "(?<=\\()([^()]*?)(?=\\)[^()]*$)"))
#> # A tibble: 3 x 3
#>   group textstring                  newvar 
#>   <chr> <chr>                       <chr>  
#> 1 A     This is one (example)       example
#> 2 B     This (is) another (example) example
#> 3 C     This is (a 3rd) (example)   example

Created on 2021-03-26 by the reprex package (v1.0.0)
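
For anyone comparing the two patterns, here is a minimal sketch on a plain character vector. It assumes only that stringr is installed; the names `x` and `last_paren` are made up for illustration and are not part of any package. The original pattern overshoots because `.*` is greedy, so it runs from the first "(" to the last ")". The helper at the end is an alternative approach, not the method used above: it collects every parenthesised chunk and keeps the last one.

```r
library(stringr)

x <- c("This is one (example)",
       "This (is) another (example)",
       "This is (a 3rd) (example)")

# Original pattern: greedy `.*` spans from the first "(" to the last ")".
# Gives: "example", "is) another (example", "a 3rd) (example"
greedy <- str_extract(x, "(?<=\\().*(?=\\))")

# Pattern from the reply (without the optional capture group and lazy `?`).
# Gives "example" for all three strings.
last_only <- str_extract(x, "(?<=\\()[^()]*(?=\\)[^()]*$)")

# Hypothetical helper: extract every "(...)" chunk, then keep the last one.
# Returns NA when a string has no parentheses at all.
last_paren <- function(s) {
  hits <- str_extract_all(s, "(?<=\\()[^()]*(?=\\))")
  vapply(
    hits,
    function(h) if (length(h) > 0) h[length(h)] else NA_character_,
    character(1)
  )
}

last_paren(x)
```

Inside a pipeline the helper could be used as mutate(newvar = last_paren(textstring)). Neither version is meant to handle nested parentheses.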


That works beautifully, thank you so much!
