Define "into" using "col" when separating

tidyr
tidyeval
#1

I'm curious if there's a way to programmatically assign the into argument of tidyr::separate based on the col argument. The data I receive (a sample is below!) often contains hyphenated columns like "GP-GS" or "sb-att".

library(tidyverse)
test <- tribble(~Player, ~"gp-gs", ~ab, ~"sb-att",
                "A", "16-16", 68, "6-9",
                "B", "10-6", 26, "0-1",
                "C", "14-9", 69, "5-9",
)

My first thought was to create a new function separate2 and use that:

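separate2 <- function(data, col, ...) {
  # `col` must be a string here, e.g. "gp-gs".
  # Note: `...` is accepted but not forwarded to separate().
  separate(data, col,
           # New column names: the text before the first hyphen and after the last
           into = c(str_extract(col, "^[^-]+"),
                    str_extract(col, "[^-]+$")),
           sep = "[-]+",
           extra = "merge")
}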

separate2(test,"gp-gs")

This works, but only when col is passed as a quoted string. Tidy evaluation may be a way to accept an unquoted column name, as tidyr::separate does, but I'm unsure where to begin.

As a bonus question: how might one apply the new separate2 function to all columns containing a hyphen (as opposed to explicitly naming the columns, which are inconsistently named across data sources)? A for loop certainly works, but is there a more "tidy" solution?

for (col in str_subset(names(test),"-")) {
  test <- separate2(test,col)
}

Thanks in advance!


#2

You can do both of those things, but to be honest, I'm not sure either one is an improvement :)

library(tidyverse)
test <- tribble(~Player, ~"gp-gs", ~ab, ~"sb-att",
                "A", "16-16", 68, "6-9",
                "B", "10-6", 26, "0-1",
                "C", "14-9", 69, "5-9",
)

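separate2 <- function(data, col, ...) {
  # Capture `col` as a symbol or a string, then convert it to a plain string,
  # so both separate2(test, "gp-gs") and separate2(test, `gp-gs`) work
  col <- rlang::as_name(rlang::ensym(col))

  separate(data, col,
           into = c(str_extract(col, "^[^-]+"),
                    str_extract(col, "[^-]+$")),
           sep = "[-]+",
           extra = "merge")
}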

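# Build one partially applied separate2() per hyphenated column ...
funs <- purrr::map(str_subset(names(test), "-"), function(col) {
  purrr::partial(separate2, col = !!col)
})
# ... and chain them into a single function of the data frame
fun <- purrr::compose(!!!funs)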

fun(test)
#> # A tibble: 3 x 6
#>   Player gp    gs       ab sb    att  
#>   <chr>  <chr> <chr> <dbl> <chr> <chr>
#> 1 A      16    16       68 6     9    
#> 2 B      10    6        26 0     1    
#> 3 C      14    9        69 5     9

Created on 2019-03-20 by the reprex package (v0.2.1)
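
For the bonus question, another possible approach is to fold a string-based helper over the hyphenated column names with purrr::reduce(). This is only a sketch, not the method used in the reply above: the helper name separate_chr is hypothetical, and it takes the column name as a string on purpose, because rlang::ensym() captures the argument expression rather than its value, which makes the ensym-based separate2 awkward to call from reduce().

```r
library(tibble)
library(tidyr)
library(purrr)
library(stringr)

test <- tribble(~Player, ~"gp-gs", ~ab, ~"sb-att",
                "A", "16-16", 68, "6-9",
                "B", "10-6", 26, "0-1",
                "C", "14-9", 69, "5-9")

# Hypothetical helper: `col` is a column name given as a string.
# The new column names are the parts of `col` before and after the hyphen.
separate_chr <- function(data, col) {
  separate(data, col,
           into = c(str_extract(col, "^[^-]+"),
                    str_extract(col, "[^-]+$")),
           sep = "[-]+",
           extra = "merge")
}

# Every column whose name contains a hyphen
hyphen_cols <- str_subset(names(test), "-")

# Start from `test` and apply separate_chr() once per hyphenated column
result <- reduce(hyphen_cols, separate_chr, .init = test)
result
```

This does the same work as the for loop in the question, but without reassigning test on each iteration; whether it reads better is a matter of taste.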


#3

Misha,

Thanks for the feedback! That's working (for the time being...) on my end. I appreciate the help!


#4

If your question's been answered (even if by you), would you mind choosing a solution? (See FAQ below for how).

Having questions checked as resolved makes it a bit easier to navigate the site visually and see which threads still need help.

Thanks

