I have an aphorism that almost every time I reach for stringr, what I really want is tidyr::separate. Today, however, I came up with what I thought was a simple string-manipulation example in class and ended up with a much more complicated solution than I would have liked. The data has a variable of street names, and we want to split off the "suffixes" (St, Dr, Ave, etc.).
Here was my first (non-functional) attempt:
library(Stat2Data)
library(dplyr)
library(tidyr)
data("RailsTrails")
RailsTrails <- RailsTrails %>%
  separate(StreetName, into = c("name", "kind"), sep = " ", extra = "merge")
This doesn't work for multi-word street names: the split happens at the first space, so only the first word goes into name and everything after it is merged into kind. Is there a way to modify this separate() call so that it splits at the last space instead?
Since I couldn't figure that out, I tried using str_split() and ended up with this,
If you really want to use separate, this seems to work, though I think it is evil.
library(tidyr)
DF <- data.frame(Street = c("Main St", "Pretty Tree Dr", "Three Word Name Ave"))
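# Split only at the last space: a space followed by non-space
# characters that run to the end of the string.
DF2 <- DF %>%
  separate(col = "Street", into = c("Name", "Type"),
           sep = " (?=[^ ]+$)",
           remove = FALSE)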
DF2
#>                Street            Name Type
#> 1             Main St            Main   St
#> 2      Pretty Tree Dr     Pretty Tree   Dr
#> 3 Three Word Name Ave Three Word Name  Ave
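In case the regular expression looks opaque: it matches a single space, and the lookahead `(?=[^ ]+$)` only lets that match succeed when everything after the space, up to the end of the string, is free of spaces. So only the last space in each value qualifies. Here is a rough base R sketch of the same idea, using `strsplit()` with `perl = TRUE` rather than `separate()`; the names `streets`, `pattern`, `pieces`, `name` and `kind` are just made up for illustration.

```r
streets <- c("Main St", "Pretty Tree Dr", "Three Word Name Ave")

# A space, but only if it is followed by one or more non-space
# characters and then the end of the string (a lookahead, so those
# characters are checked but not consumed by the match).
pattern <- " (?=[^ ]+$)"

# perl = TRUE is needed because lookaheads are a PCRE feature.
pieces <- strsplit(streets, pattern, perl = TRUE)

# Each element of pieces now holds two strings: everything before
# the last space, and the final word.
name <- vapply(pieces, `[`, character(1), 1)
kind <- vapply(pieces, `[`, character(1), 2)

data.frame(Street = streets, Name = name, Type = kind)
```

A value with no space at all (a one-word street name) is left unsplit, so its second piece comes back as NA in this sketch.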