bug or feature?: data viewer renders repeated whitespaces as single whitespace

I just figured out why some strings were not matched by a regular expression: the data viewer renders repeated whitespace as a single space. (From now on I will be using str_squish().)

Example:

View(rbind("foo bar", "foo  bar"))

I imagine there might be data types where this is intended? Probably related: https://datatables.net/forums/discussion/43122/space-in-fields
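A quick way to confirm that only the display is affected, and not the stored data, is to check the strings outside the viewer. This is a minimal sketch using the two strings from the example above; the variable names `x` and `squished` are only illustrative.

```r
library(stringr)

x <- c("foo bar", "foo  bar")

# The stored strings differ in length, even if a viewer displays them alike.
nchar(x)
#> [1] 7 8
x[1] == x[2]
#> [1] FALSE

# str_squish() trims the ends and collapses internal runs of whitespace
# into a single space, so the two strings become identical afterwards.
squished <- str_squish(x)
squished[1] == squished[2]
#> [1] TRUE
```

Comparing nchar() before and after squishing is a cheap check for hidden repeated whitespace when a regular expression unexpectedly fails to match.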

Hi marco, I noticed this as well but I actually used it to my advantage when I was importing messy data that were originally .txt files. To scrape it in, I had to work around a lot of arbitrary, inconsistent white spaces. I can see how it can be an obstacle when you want to match exact whitespace lengths.

My stringr cheat sheet also says that for regular expressions, \s means "any whitespace", while [:space:] means "space characters" and [:blank:] means "space and tab (but not new line)". I wonder if you can match a [:space:]{1} pattern where {n} quantifies "exactly n times."

Not sure if [:space:] is effectively the same as \s.
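One way to explore the question is to test each class against a few individual whitespace characters. The sketch below only covers a space, a tab, and a newline, so it shows where the classes agree or differ on those three characters and does not prove equivalence in general. The POSIX classes are written inside a bracket expression, as in [[:space:]].

```r
library(stringr)

# One sample per element: a space, a tab, and a newline.
samples <- c(" ", "\t", "\n")

# \s matches all three.
str_detect(samples, "\\s")
#> [1] TRUE TRUE TRUE

# [[:space:]] also matches all three.
str_detect(samples, "[[:space:]]")
#> [1] TRUE TRUE TRUE

# [[:blank:]] matches the space and the tab, but not the newline.
str_detect(samples, "[[:blank:]]")
#> [1]  TRUE  TRUE FALSE
```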

This is a bug in old versions of the data viewer.

It's fixed in the current release of RStudio (1.2).

I'm pretty sure [[:space:]] is equivalent to \s.

And for the quantifier: that would be what the {n} is for, right? e.g.

string <- c("foo bar", "foo  bar")
stringr::str_detect(string, "[[:space:]]{2}")
#> [1] FALSE  TRUE

Created on 2019-04-18 by the reprex package (v0.2.1)
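One caveat worth noting: because the pattern is not anchored, "[[:space:]]{2}" also matches inside a longer run of whitespace. If the goal is a run of exactly two, one option (a sketch, not the only approach) is to add lookarounds so the run cannot be preceded or followed by more whitespace. The third test string and the variable names here are made up for illustration.

```r
library(stringr)

string <- c("foo bar", "foo  bar", "foo   bar")

# Unanchored {2}: matches any string containing at least two consecutive
# whitespace characters, so the three-space string matches as well.
at_least_two <- str_detect(string, "[[:space:]]{2}")
at_least_two
#> [1] FALSE  TRUE  TRUE

# Lookbehind and lookahead reject runs that are part of a longer run,
# leaving only runs of exactly two whitespace characters.
exactly_two <- str_detect(
  string,
  "(?<![[:space:]])[[:space:]]{2}(?![[:space:]])"
)
exactly_two
#> [1] FALSE  TRUE FALSE
```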

Ah great. thank you.
