Use of regexp argument in fs::dir_ls()

library(fs)
getwd()                        # In case this matters, although reprex creates
#> [1] "/private/var/folders/mk/lh99bg295msg8myvcf5yczkc0000gn/T/RtmpgFcnqm/reprex99941943cd46"
file_create("ps.xls")          # its own temp directory.
file_create("cont.xls")
dir_ls(".", regexp = "p*xls")  # I expect this to only return ps.xls  
#> cont.xls ps.xls

Created on 2019-11-24 by the reprex package (v0.3.0)

I would expect this to return just ps.xls. How can regular expression "p*xls" match "cont.xls"?

Because in regular expressions the "*" metacharacter is a quantifier meaning "zero or more times", not a wildcard. You are asking for "p" zero or more times followed by "xls", and because the pattern is not anchored it can match anywhere in the file name. "cont.xls" satisfies that condition with zero occurrences of "p". Maybe you want this instead: ^p.+xls$
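
To see this outside of fs, here is a minimal sketch using base R's grepl() and regexpr() on the two file names from the question. It assumes dir_ls() filters paths with ordinary grep-style matching, as its documentation describes; the variable names are only for illustration.

```r
# Base R only: the two file names from the reprex.
files <- c("cont.xls", "ps.xls")

# "p*xls" is unanchored: zero or more "p", then "xls", anywhere in the string.
loose <- grepl("p*xls", files)

# Show which substring actually matched in each name.
matched <- regmatches(files, regexpr("p*xls", files))

# Anchored alternative: starts with "p", ends with "xls".
strict <- grepl("^p.+xls$", files)

print(loose)    # TRUE TRUE
print(matched)  # "xls" "xls"
print(strict)   # FALSE TRUE
```

In both names the match is just the trailing "xls", with the "p*" part matching nothing, which is why "cont.xls" slips through.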

Thanks for the reprex for the regexp!

The pattern

"p*xls"

translates to

zero or more occurrences of the character p, followed by the string xls

So, in the immortal words of Keith and Mick

You can't always get what you want

To get what you need, the pattern should be

"p.*xls", which translates as the character p followed by zero or more occurrences of any character, followed by the xls string.
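
A caveat worth checking: "p.*xls" is still unanchored, so it matches any name that contains a "p" somewhere before "xls". The sketch below compares it with an anchored version and with the regular expression that base R's glob2rx() produces from the shell-style pattern "p*xls", which may be what the original pattern was meant to express. The name "top.xlsx" is a made-up extra file for illustration. The fs documentation also lists a glob argument for dir_ls(), which might be the more direct route; that call is not run here.

```r
# File names from the reprex plus one hypothetical extra, "top.xlsx".
files <- c("cont.xls", "ps.xls", "top.xlsx")

# Unanchored: "p", then anything, then "xls", anywhere in the name.
unanchored <- grepl("p.*xls", files)

# Anchored: must start with "p" and end with a literal ".xls".
anchored <- grepl("^p.*\\.xls$", files)

# Translate the shell-style wildcard "p*xls" into a regular expression.
from_glob <- utils::glob2rx("p*xls")
globbed <- grepl(from_glob, files)

print(unanchored)  # FALSE TRUE TRUE
print(anchored)    # FALSE TRUE FALSE
print(from_glob)   # "^p.*xls$"
print(globbed)     # FALSE TRUE FALSE
```

Anchoring with ^ and $ is the safer choice once the directory holds more than a couple of files.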

I've been using grep and its progeny going back to the first Reagan administration, and your question illustrates exactly the kind of problem that I still have to overcome all the time. Regular expressions are so darned powerful that it's easy to lose track of the picky syntax that makes them so.

Thanks to both andresrcs and technocrat for the help!

Thanks for marking a solution (either would have done equally well). It really cuts down on the effort required for site users to zero in on useful answers.