`dat` is a character vector of the strings to be split; for this question that's really all that's needed for a reprex.
Examination of `dat` shows a consistent pattern of separation into a forepart before the first curved bracket `(` and an aftpart from the first curved bracket until the end of the string.
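One way to confirm that pattern before relying on it is to count the opening brackets in each string. This is only a sketch: the vector `x` below is a hypothetical two-element sample copied from the data in this answer, and `n_open` and `ends_closed` are names introduced here for illustration.

```r
# hypothetical sample: two strings copied from the data in this answer
x <- c("Cardiac arrest (n/a - Fatal - Results in Death)",
       "Dysphagia (n/a - Unknown - )")

# number of opening curved brackets in each string
n_open <- lengths(regmatches(x, gregexpr("(", x, fixed = TRUE)))

# does each string end with a closing curved bracket?
ends_closed <- endsWith(x, ")")

n_open
ends_closed
```

If every count is 1 and every string ends with `)`, splitting at the single bracket should be safe for the data at hand.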
Two regular expressions are composed to distinguish the two parts. The first specifies the beginning `^` of the string followed by anything `.*` followed by `(`, which is a special character that must be escaped as `\\(`. The second specifies a blank followed by `(` and then anything `.*` to the end `$` of the string.
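To see what each of the two expressions actually matches, it may help to extract the match rather than delete it. The sketch below uses one string from the data; the string `y` and the names `fore_match`, `aft_match`, `greedy` and `lazy` are made up for illustration. It also shows that `.*` is greedy, which makes no difference here only because each string holds a single `(`.

```r
x <- "Cardiac arrest (n/a - Fatal - Results in Death)"
forepart <- "^.*\\("
aftpart <- " \\(.*$"

# extract the text each pattern matches
fore_match <- regmatches(x, regexpr(forepart, x))
aft_match <- regmatches(x, regexpr(aftpart, x))
fore_match  # "Cardiac arrest ("
aft_match   # " (n/a - Fatal - Results in Death)"

# hypothetical string with two opening brackets
y <- "a (b) (c)"
# greedy .* runs to the LAST opening bracket
greedy <- regmatches(y, regexpr("^.*\\(", y))
# lazy .*? stops at the FIRST one (PCRE syntax, hence perl = TRUE)
lazy <- regmatches(y, regexpr("^.*?\\(", y, perl = TRUE))
greedy  # "a (b) ("
lazy    # "a ("
```

If the real data could contain more than one `(` per string, the lazy form is probably the one that matches the "first curved bracket" description.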
Once the forepart is removed, the second part still carries some leftover characters, a closing `)` and in some rows a dangling ` - `, that the third and fourth expressions address.
The main section creates a character vector of the forepart and a separate vector of the aftpart. The aftpart takes three steps: the portion left over after the forepart is removed is piped with `|>` to remove the trailing `)` and piped again to remove the trailing ` - `.
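The three steps can be traced on a single string to see what each one leaves behind. This is a sketch with the patterns written inline; `step1`, `step2` and `step3` are names used only for this illustration.

```r
x <- "Dysphagia (n/a - Unknown - )"

# step 1: drop everything up to and including the opening bracket
step1 <- gsub("^.*\\(", "", x)
# step 2: drop the closing bracket at the end
step2 <- gsub("\\)$", "", step1)
# step 3: drop the dangling separator, if any
step3 <- gsub(" - $", "", step2)

step1  # "n/a - Unknown - )"
step2  # "n/a - Unknown - "
step3  # "n/a - Unknown"
```

The order matters: the ` - ` pattern is anchored to the end of the string, so it can only match after the closing `)` has been removed.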
# data
dat <- c("Acute respiratory failure (2d - Fatal - Results in Death, Caused/Prolonged Hospitalisation)",
         "Bronchial secretion retention (n/a - Fatal - Results in Death)",
         "Cardiac arrest (n/a - Fatal - Results in Death)",
         "Hypoxic-ischaemic encephalopathy (2d - Fatal - Results in Death)",
         "Metabolic acidosis (n/a - Unknown - Other Medically Important Condition)",
         "Pneumonia (n/a - Not Recovered/Not Resolved - Caused/Prolonged Hospitalisation)",
         "Dysphagia (n/a - Unknown - )",
         "Hypotonia (n/a - Unknown - )",
         "Muscular weakness (n/a - Unknown - )")
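# patterns
forepart <- "^.*\\("  # start of string through the opening bracket
aftpart <- " \\(.*$"  # blank plus opening bracket through end of string
paren <- "\\)$"       # closing bracket at end of string
trail <- " - $"       # dangling separator at end of string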
# functions
snip_aft <- function(x) gsub(aftpart,"",x)
snip_fore <- function(x) gsub(forepart,"",x)
snip_paren <- function(x) gsub(paren,"",x)
snip_trail <- function(x) gsub(trail,"",x)
# main
part1 <- snip_aft(dat)
# the `_` placeholder for the native pipe needs R >= 4.2
part2 <- snip_fore(dat) |>
  snip_paren(x = _) |>
  snip_trail(x = _)
DF <- data.frame(part1 = part1, part2 = part2)
DF
#> part1
#> 1 Acute respiratory failure
#> 2 Bronchial secretion retention
#> 3 Cardiac arrest
#> 4 Hypoxic-ischaemic encephalopathy
#> 5 Metabolic acidosis
#> 6 Pneumonia
#> 7 Dysphagia
#> 8 Hypotonia
#> 9 Muscular weakness
#> part2
#> 1 2d - Fatal - Results in Death, Caused/Prolonged Hospitalisation
#> 2 n/a - Fatal - Results in Death
#> 3 n/a - Fatal - Results in Death
#> 4 2d - Fatal - Results in Death
#> 5 n/a - Unknown - Other Medically Important Condition
#> 6 n/a - Not Recovered/Not Resolved - Caused/Prolonged Hospitalisation
#> 7 n/a - Unknown
#> 8 n/a - Unknown
#> 9 n/a - Unknown