Need to eliminate the format

I have sample input values in the format below:
{XXXX}{P22}
{XXXX}{P265}
{XXXX}{D26}

I need to remove the brackets and the single character right after the second opening bracket (e.g. P or D). I want the output as below:

XXXX22
XXXX265
XXXX26

Has anyone tried this kind of handling?

Hello,

You can do this with a regular expression and the stringr package:

library(stringr)

#The values to change
values = c("{XXXX}{P22}", "{XXXX}{P265}", "{XXXX}{D26}")

#Keep only the capture groups, i.e. the parts of the pattern inside ()
values = str_replace(values, "\\{(.*)\\}\\{.(\\d+)\\}", "\\1\\2")

values
#> [1] "XXXX22"  "XXXX265" "XXXX26"

Created on 2020-10-01 by the reprex package (v0.3.0)

Details on the regex

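  • A \ escapes a special character so that it is interpreted literally (here the braces { and }). Since \ itself has to be escaped in an R string, it becomes a double \\ in the pattern
  • The parentheses () denote a capture group, i.e. everything that matches between them is saved
  • .* matches any character, zero or more times (here, the text inside the first pair of brackets)
  • The single . before the second group matches exactly one character (the P or D), which is left out of the capture groups and therefore dropped
  • \d+ matches one or more digits
  • The \1\2 in the replacement refer to the first and second capture groups of the pattern, so only those two parts are kept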
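
In case it helps to see the capture groups in isolation, here is a minimal sketch of the same idea in base R, without stringr. The stricter pattern ([^}]* instead of .*, and [A-Za-z] instead of .) is my own assumption about the input, namely that the first block never contains a closing bracket and the dropped character is always a letter. The names pattern, result and groups are only for illustration.

```r
# The values to change (same sample input as above)
values <- c("{XXXX}{P22}", "{XXXX}{P265}", "{XXXX}{D26}")

# Assumed stricter pattern: group 1 is anything except "}", then one letter
# that is not captured, then group 2 with one or more digits
pattern <- "\\{([^}]*)\\}\\{[A-Za-z](\\d+)\\}"

# sub() replaces the whole match with group 1 followed by group 2
result <- sub(pattern, "\\1\\2", values, perl = TRUE)
result
#> [1] "XXXX22"  "XXXX265" "XXXX26"

# regexec() + regmatches() return the full match and each capture group,
# which is handy for checking what the parentheses actually caught
groups <- regmatches(values, regexec(pattern, values, perl = TRUE))
groups[[1]]  # full match, then group 1 ("XXXX"), then group 2 ("22")
```

If the real data can have something other than a single letter in that position, the original . from the stringr version is the more permissive choice.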

Hope this helps,
PJ

Finally your solution worked.

Thanks PJ
