How do I combine starts_with AND ends_with when referencing multiple columns?

Let's say we have the following columns (though there can be dozens more):

abc-123
abc*123
abc-789
abc*789
zyx-123
zyx*123
zyx-789
zyx*789

And the goal is to select the two columns abc-789 and abc*789. How can I combine both starts_with('abc') and ends_with('789') for this case?

If you want any columns that start with 'abc' or end with '789', you can combine the helpers with c(). If you want only the columns whose names both start with 'abc' and end with '789', you can use a regular expression (regex) with matches().

library(dplyr)

x <- tibble(
  `abc-123` = 1,
  `abc*123` = 1,
  `abc-789` = 1,
  `abc*789` = 1,
  `zyx-123` = 1,
  `zyx*123` = 1,
  `zyx-789` = 1,
  `zyx*789` = 1
)

select(x, c(starts_with("abc"), ends_with("789")))
# A tibble: 1 × 6
  `abc-123` `abc*123` `abc-789` `abc*789` `zyx-789` `zyx*789`
      <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
1         1         1         1         1         1         1

select(x, matches("^abc.*789$"))
# A tibble: 1 × 2
  `abc-789` `abc*789`
      <dbl>     <dbl>
1         1         1

The regex here reads as: strings that start (^) with abc, then have any number of characters of any type (.*), then end ($) with 789.
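
To see what the anchors contribute, the same pattern can be tried directly on the column names with base R's grepl(). This is only a sketch: the cols vector and the extra test string "xabc-7890" are made up for illustration and are not part of the original data.

```r
# Column names from the question, stored as a plain character vector
cols <- c("abc-123", "abc*123", "abc-789", "abc*789",
          "zyx-123", "zyx*123", "zyx-789", "zyx*789")

# Anchored pattern: must start with "abc" and end with "789"
hits <- cols[grepl("^abc.*789$", cols)]
print(hits)

# Without ^ and $, the pattern can match anywhere inside a name.
# "xabc-7890" is a hypothetical name used only to show the difference.
unanchored <- grepl("abc.*789", "xabc-7890")    # TRUE: matches in the middle
anchored   <- grepl("^abc.*789$", "xabc-7890")  # FALSE: fails both anchors
print(c(unanchored = unanchored, anchored = anchored))
```

Dropping the anchors would therefore also pick up any column that merely contains abc followed later by 789.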

library(tidyverse)

x <- tibble(
  `abc-123` = 1,
  `abc*123` = 1,
  `abc-789` = 1,
  `abc*789` = 1,
  `zyx-123` = 1,
  `zyx*123` = 1,
  `zyx-789` = 1,
  `zyx*789` = 1
)

x %>% select(starts_with("abc") & ends_with("789"))
#> # A tibble: 1 × 2
#>   `abc-789` `abc*789`
#>       <dbl>     <dbl>
#> 1         1         1

Created on 2021-08-17 by the reprex package (v2.0.1)
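
The & in the select() call above keeps only the columns picked by both helpers, i.e. the intersection of the two selections. As a rough base R sketch of the same logic (not how tidyselect works internally; the names keep_both and keep_either are just illustrative), you can combine logical vectors built with startsWith() and endsWith():

```r
# Same columns as above, built as a base data frame.
# check.names = FALSE keeps the non-syntactic names unchanged.
x <- data.frame(
  `abc-123` = 1, `abc*123` = 1, `abc-789` = 1, `abc*789` = 1,
  `zyx-123` = 1, `zyx*123` = 1, `zyx-789` = 1, `zyx*789` = 1,
  check.names = FALSE
)

# Intersection: name starts with "abc" AND ends with "789"
keep_both <- startsWith(names(x), "abc") & endsWith(names(x), "789")
both <- x[, keep_both, drop = FALSE]
print(names(both))

# Union: name starts with "abc" OR ends with "789"
# (the same set of columns that c() selects in the earlier reply)
keep_either <- startsWith(names(x), "abc") | endsWith(names(x), "789")
either <- x[, keep_either, drop = FALSE]
print(names(either))
```

Unlike matches(), this treats 'abc' and '789' as literal strings, so nothing needs escaping if the prefix or suffix contains regex special characters.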
