How do I combine starts_with AND ends_with when referencing multiple columns?

pathos · August 17, 2021, 4:07pm

Let's say we have the following columns (though there can be dozens more):

abc-123
abc*123
abc-789
abc*789
zyx-123
zyx*123
zyx-789
zyx*789

And the goal is to select the two columns abc-789 and abc*789. How can I combine both starts_with('abc') and ends_with('789') for this case?

matt · August 17, 2021, 4:20pm

If you want every column that starts with 'abc' plus every column that ends with '789' (the union of the two sets), you can use c(). If you want only the columns whose names both start with 'abc' and end with '789', you can use a regular expression (regex) with matches().

library(dplyr)

x <- tibble(
  `abc-123` = 1,
  `abc*123` = 1,
  `abc-789` = 1,
  `abc*789` = 1,
  `zyx-123` = 1,
  `zyx*123` = 1,
  `zyx-789` = 1,
  `zyx*789` = 1
)

select(x, c(starts_with("abc"), ends_with("789")))
# A tibble: 1 × 6
  `abc-123` `abc*123` `abc-789` `abc*789` `zyx-789` `zyx*789`
      <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
1         1         1         1         1         1         1

select(x, matches("^abc.*789$"))
# A tibble: 1 × 2
  `abc-789` `abc*789`
      <dbl>     <dbl>
1         1         1

The regex here reads as 'strings that start (^) with abc, then have any number of characters of any type (.*), then end ($) with 789'.
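
To check which names the pattern accepts without going through select(), here is a minimal base R sketch that applies the same regex with grepl(). The vector name `nms` is made up for this illustration; the names themselves are the ones from the question.

```r
# Column names from the example above (`nms` is a name made up for this sketch)
nms <- c("abc-123", "abc*123", "abc-789", "abc*789",
         "zyx-123", "zyx*123", "zyx-789", "zyx*789")

# ^abc  anchors the match at the start of the name
# .*    allows any characters (including none) in between
# 789$  anchors the match at the end of the name
hits <- grepl("^abc.*789$", nms)

nms[hits]
#> [1] "abc-789" "abc*789"
```

Because the `-` and `*` in the column names are only ever matched by `.*`, they need no escaping in this pattern.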

EconProf · August 17, 2021, 4:29pm

library(tidyverse)

x <- tibble(
  `abc-123` = 1,
  `abc*123` = 1,
  `abc-789` = 1,
  `abc*789` = 1,
  `zyx-123` = 1,
  `zyx*123` = 1,
  `zyx-789` = 1,
  `zyx*789` = 1
)

# Combine the two selection helpers with & to keep only columns matching both
x %>% select(starts_with("abc") & ends_with("789"))
#> # A tibble: 1 × 2
#>   `abc-789` `abc*789`
#>       <dbl>     <dbl>
#> 1         1         1

Created on 2021-08-17 by the reprex package (v2.0.1)
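
The other boolean operators compose the same way. As a rough sketch, assuming a dplyr/tidyselect version recent enough to support boolean operators on selection helpers, `|` gives the same set of columns as the c() call earlier in the thread, and `!` negates a helper. The object names `either` and `abc_not_789` are made up for this illustration.

```r
library(dplyr)

x <- tibble(
  `abc-123` = 1, `abc*123` = 1, `abc-789` = 1, `abc*789` = 1,
  `zyx-123` = 1, `zyx*123` = 1, `zyx-789` = 1, `zyx*789` = 1
)

# Union: columns that start with "abc" OR end with "789"
either <- x %>% select(starts_with("abc") | ends_with("789"))
names(either)

# Intersection with a negation: start with "abc" but do NOT end with "789"
abc_not_789 <- x %>% select(starts_with("abc") & !ends_with("789"))
names(abc_not_789)
```

The first call should return six columns and the second only `abc-123` and `abc*123`.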
