ends_with function is not working properly

Hello Guys,

I have run into something confusing. I am trying to select columns of a table with the ends_with function from the dplyr package. It works fine when I pass a plain string, but in my case I need to use a regex. Below is a reprex of what I am trying to do.

I am sure it is something silly that I am forgetting. Kindly help out.

 library(reprex)

  library(tidyverse)
#> Warning: package 'dplyr' was built under R version 3.6.3
#> Warning: package 'forcats' was built under R version 3.6.3
  
  tibble(
    a = c(1,2,1,3,5,8),
    b = c(1,2,3,4,5,6),
    a_11 = c(7,8,9,4,5,6)
  ) -> temp
  
  ## ends_with works with simple characters
  temp %>% 
    select(ends_with("a")) %>% names()
#> [1] "a"
  
  ## ends_with does not work with regex
  temp %>% 
    select(ends_with("_[:digit:]{1,}")) %>% names()
#> character(0)
  
  ## checking if regex is correct
  str_detect(names(temp),"_[:digit:]{1,}" )
#> [1] FALSE FALSE  TRUE

Created on 2020-04-26 by the reprex package (v0.3.0)

ends_with simply does not seem to work with regex. For example,

temp %>% 
  select(ends_with(".")) %>% names()

gives no result either.


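You should use matches() to match a regular expression; ends_with() only matches a literal suffix. Adding a $ at the end of the pattern anchors it to the end of the column name, which gives you what you were hoping to achieve with ends_with().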
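To illustrate the difference, here is a minimal sketch. It assumes dplyr is installed and uses a made-up two-row version of the table from the question; the patterns are only examples.

```r
library(dplyr)

# Small stand-in for the table in the question (values are illustrative)
temp <- tibble(a = c(1, 2), b = c(3, 4), a_11 = c(5, 6))

# ends_with() compares the suffix literally, so "." only matches a real dot
temp %>% select(ends_with(".")) %>% names()
#> character(0)

# matches() treats its argument as a regex, so "." means "any character"
# and "$" anchors the pattern to the end of the column name
temp %>% select(matches(".$")) %>% names()
#> [1] "a"    "b"    "a_11"

# Columns whose name ends in the character "1"
temp %>% select(matches("1$")) %>% names()
#> [1] "a_11"
```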

Hello @siddharthprabhu,
Thanks for the quick reply. I tried what you suggested, but it still didn't work. Below is the reprex.

Kindly point out if there is a silly syntax mistake.

library(reprex)

  library(tidyverse)
#> Warning: package 'dplyr' was built under R version 3.6.3
#> Warning: package 'forcats' was built under R version 3.6.3
  
  tibble(
    a = c(1,2,1,3,5,8),
    b = c(1,2,3,4,5,6),
    a_11 = c(7,8,9,4,5,6)
  ) -> temp
  
  ## ends_with works with simple characters
  temp %>% 
    select(ends_with("a")) %>% names()
#> [1] "a"
  
  ## ends_with does not work with regex
  temp %>% 
    select(ends_with(paste("_[:digit:]{1,}"))) %>% names()
#> character(0)
  temp %>% 
    select(ends_with("_[:digit:]{1,}")) %>% names()
#> character(0)
  ## checking if regex is correct
  str_detect(names(temp),"_[:digit:]{1,}" )
#> [1] FALSE FALSE  TRUE

  ## Trying the matches function
  temp %>% 
    select(matches("_[:digit:]{1,}")) %>% names()
#> character(0)
  temp %>% 
    select(matches("_[:digit:]{1,}$")) %>% names() ## Not sure if this is what was meant by adding a '$'
#> character(0)

Created on 2020-04-27 by the reprex package (v0.3.0)

Can you please explain what pattern you are hoping to match? Do you want all columns which end in _ followed by one or more numbers? You could do that with the following regex.

library(tidyverse)

tibble(
  a = c(1,2,1,3,5,8),
  b = c(1,2,3,4,5,6),
  a_11 = c(7,8,9,4,5,6)
) -> temp

temp %>% select(matches("_\\d+$")) %>% names()
#> [1] "a_11"

Created on 2020-04-27 by the reprex package (v0.3.0)


Hey @siddharthprabhu,
You guessed correctly. I am trying to extract all columns ending with _ followed by one or more digits.

Using "_\\d+$" works. Thanks a ton.

However, I do not understand why "_[:digit:]{1,}$" does not work with either matches or ends_with, while it works fine with str_detect. Maybe I will look into regex and find the reason.

Thanks again

I'm not sure either. It's possible that the regex implementations underlying stringr and tidyselect are different but this is just a wild guess. Maybe someone else will know.
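A likely explanation, offered as a sketch rather than a confirmed diagnosis: as far as I know, matches() passes the pattern to base R's regex engine (grep() with perl = FALSE by default), where a POSIX class such as [:digit:] is only recognised inside a bracket expression, i.e. [[:digit:]]. Written bare, [:digit:] is read as an ordinary set containing the characters ":", "d", "i", "g" and "t". stringr uses the ICU engine, which appears to accept the bare form, and that would explain why str_detect succeeds. The table below is a shortened stand-in for the one in the question.

```r
library(dplyr)

# Small stand-in for the table in the question (values are illustrative)
temp <- tibble(a = c(1, 2), b = c(3, 4), a_11 = c(5, 6))

# Bare [:digit:] in base R's default engine: a set of the characters
# ':', 'd', 'i', 'g', 't', so "_" followed by "1" does not match
grepl("_[:digit:]{1,}$", names(temp))
#> [1] FALSE FALSE FALSE

# Wrapped in an outer bracket expression, it is the POSIX digit class
grepl("_[[:digit:]]{1,}$", names(temp))
#> [1] FALSE FALSE  TRUE

# The same corrected pattern inside matches()
temp %>% select(matches("_[[:digit:]]{1,}$")) %>% names()
#> [1] "a_11"
```

If this reading is right, "_\\d+$" and "_[[:digit:]]+$" should be interchangeable in matches() for column names like these.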
