Mutating three variables from a variable: separate() or something else? [sep: "(" and space]

Suppose df contains a variable v, from which I want to create three new variables: everything before "(" becomes the first variable; everything after "(" but before the space becomes the second; and everything after the space, without the closing ")", becomes the third.

df

v
mean(air_temperature PT30M)
max(soil_temperature PT5M)

df_wanted

method  datatype          period
mean    air_temperature   PT30M
max     soil_temperature  PT5M

library(tidyverse)
# Toy data
df <- tibble(
  v = c("mean(air_temperature PT30M)", "max(soil_temperature PT5M)")
)

# What I tried
df |> 
  separate(v, into = c("method", "dt_period"), sep = "(")
#> Warning in gregexpr(pattern, x, perl = TRUE): PCRE pattern compilation error
#>  'missing closing parenthesis'
#>  at ''
#> Error in gregexpr(pattern, x, perl = TRUE): invalid regular expression '('

Created on 2022-12-17 with reprex v2.0.2

library(tidyverse)
# Toy data
df1 <- tibble(
  v = c("mean(air_temperature PT30M)", "max(soil_temperature PT5M)")
)

# Each (\\w+) capture group fills one column; \\( and \\) match the literal parentheses
df1 |> 
  extract(v, into = c("method", "datatype", "period"),
          regex = "(\\w+)\\((\\w+)\\s(\\w+)\\)")

@nirgrahamuk Many thanks for the neat solution! Could you please explain this part: "(\\w+)\\((\\w+)\\s(\\w+)\\)"? I can see a pattern, but I don't want to guess :)
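
In case it helps, here is one way to read that pattern piece by piece. This is only a sketch: the names `pattern`, `x` and `m` are mine, and I use `stringr::str_match()` just to make the capture groups visible; it is not part of the solution above.

```r
library(stringr)

# The same regex as in the extract() call, written one piece per line
pattern <- paste0(
  "(\\w+)",  # group 1: one or more word characters (letters, digits, "_")
  "\\(",     # a literal "(" (escaped, because a bare "(" opens a group)
  "(\\w+)",  # group 2: one or more word characters
  "\\s",     # exactly one whitespace character
  "(\\w+)",  # group 3: one or more word characters
  "\\)"      # a literal ")"
)

x <- c("mean(air_temperature PT30M)", "max(soil_temperature PT5M)")

# Column 1 of the result is the full match; columns 2 to 4 are the three groups
m <- str_match(x, pattern)
m[, 2:4]
```

Each row of `m[, 2:4]` holds the three captured pieces of one string, and those pieces are what `extract()` writes into the three new columns.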

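This isn't your solution, but it might help you understand: "(" is a special character in a regular expression, so it has to be escaped as "\\(" to match a literal parenthesis.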

df |> 
  separate(v, into = c("method", "dt_period"), sep = "\\(")
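
Building on the escaped "(" above, the separate() route can probably be taken all the way to the three wanted columns. The following is a minimal sketch rather than the accepted solution; it assumes every value of v has exactly one "(" and one space, and that the column names from df_wanted are the ones you want.

```r
library(tidyverse)

# Toy data from the question
df <- tibble(
  v = c("mean(air_temperature PT30M)", "max(soil_temperature PT5M)")
)

df_wanted <- df |>
  # "[( ]" is a character class: split at every "(" or space.
  # Inside [] the "(" is taken literally, so it needs no escaping.
  separate(v, into = c("method", "datatype", "period"), sep = "[( ]") |>
  # The last piece still ends in ")", e.g. "PT30M)", so strip it
  mutate(period = str_remove(period, "\\)$"))

df_wanted
```

Compared with extract(), this splits on delimiters instead of describing the whole string, so it is easier to read but less strict about malformed input.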
