I'm new to R. I have a dataset that contains Date, Day, Time, and Location in a single variable. I need to split it into four separate variables (Date, Day, Time, Location):
Data (the number of spaces between fields varies from row to row):
I used the str_extract() function from the stringr package with several regular expressions. There are many online tutorials on regular expressions if you are not familiar with them.
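In case the regular expressions look cryptic, here is a small sketch of the three building blocks applied to a single string. The variable names (x, date_part, day_part, time_part) are only illustrative, and the sample string is copied from the example data.

```r
library(stringr)

x <- "Pickup Details:6. 11/21 SAT 4:40 - 5:00 @ Mountain View @ Ranch 99"

# \\d+/\\d+ : one or more digits, a slash, one or more digits
date_part <- str_extract(x, "\\d+/\\d+")

# (?<=...) is a lookbehind: the three capital letters must be preceded by
# a slash, 1-2 digits and 1-4 spaces, but that prefix is not returned
day_part <- str_extract(x, "(?<=/\\d{1,2} {1,4})[A-Z]{3}")

# .+? is lazy and (?=@) is a lookahead, so the match stops just before
# the first @; str_trim() removes the leftover surrounding spaces
time_part <- str_trim(str_extract(x, "(?<=[A-Z]{3} {1,4}).+?(?=@)"))
```

Printing date_part, day_part and time_part one at a time is an easy way to check each pattern before putting them all into mutate().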
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(stringr)
DF <- data.frame(ItemLocation=c("Pickup Details:6. 11/21 SAT 4:40 - 5:00 @ Mountain View @ Ranch 99",
"Pickup Details:4. 10/22 THU 4:40 - 5:00 @ San Mateo @ Marina",
"Pickup Details:5. 10/09 FRI 5:30 - 5:50 @ Cupertino"))
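# The year is not in the data, so 2020 is hard-coded here.
DFnew <- DF %>% mutate(
  # Date: digits/digits, with the year appended
  Date = paste0(str_extract(ItemLocation, "\\d+/\\d+"), "/2020"),
  # Day: three capital letters preceded by the date and 1-4 spaces
  Day = str_extract(ItemLocation, "(?<=/\\d{1,2} {1,4})[A-Z]{3}"),
  # Time: everything between the day and the first @
  Time = str_trim(str_extract(ItemLocation, "(?<=[A-Z]{3} {1,4}).+?(?=@)")),
  # Location: everything after the first @
  Location = str_trim(str_extract(ItemLocation, "(?<=@).+$"))
)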
DFnew
#>                                                         ItemLocation       Date
#> 1 Pickup Details:6. 11/21 SAT 4:40 - 5:00 @ Mountain View @ Ranch 99 11/21/2020
#> 2       Pickup Details:4. 10/22 THU 4:40 - 5:00 @ San Mateo @ Marina 10/22/2020
#> 3                Pickup Details:5. 10/09 FRI 5:30 - 5:50 @ Cupertino 10/09/2020
#>   Day        Time                 Location
#> 1 SAT 4:40 - 5:00 Mountain View @ Ranch 99
#> 2 THU 4:40 - 5:00       San Mateo @ Marina
#> 3 FRI 5:30 - 5:50                Cupertino
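As a possible alternative, all four pieces can be captured with one regular expression and str_match(), which returns a character matrix with one column per capture group. This is only a sketch: the names item_location, pattern, parts and result are made up for illustration, the pattern assumes every row has the shape date, day, time, @, location, and the year 2020 is the same assumption as in the code above.

```r
library(stringr)

item_location <- c(
  "Pickup Details:6. 11/21 SAT 4:40 - 5:00 @ Mountain View @ Ranch 99",
  "Pickup Details:4. 10/22 THU 4:40 - 5:00 @ San Mateo @ Marina",
  "Pickup Details:5. 10/09 FRI 5:30 - 5:50 @ Cupertino"
)

# Four capture groups: date, day, time (lazy, up to the first @), location.
# \\s+ tolerates a varying number of spaces between the fields.
pattern <- "(\\d+/\\d+)\\s+([A-Z]{3})\\s+(.+?)\\s*@\\s*(.+)$"

# Column 1 of the result is the full match; columns 2-5 are the groups.
parts <- str_match(item_location, pattern)

result <- data.frame(
  Date = paste0(parts[, 2], "/2020"),  # year is assumed, not in the data
  Day = parts[, 3],
  Time = parts[, 4],
  Location = str_trim(parts[, 5])
)

# Optional: turn the text date into a proper Date column
result$DateParsed <- as.Date(result$Date, format = "%m/%d/%Y")
```

Rows that do not fit the pattern come back as NA in every column, which makes unexpected formats easy to spot.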