Hi. I'd like to filter "hms" data using an "hh:mm:ss" value, like "filter(ride_length > '00:10:00')", but the rows aren't filtered properly (rides shorter than 10 minutes still appear in the result). When I filter in seconds instead, like "filter(ride_length > 10*60)", it seems to work. So I'm assuming that I need to get the "ride_length" data and the "hh:mm:ss" value into the same format before calling filter(), but I have no idea how to do that.
Thank you very much for any help you can offer!
Below is the code:
# Pre-processing
> data <- lapply(csv_list, read_csv) %>%
+ bind_rows() %>%
+ na.omit() %>%
+ filter(ride_length > 0)
# Check the structure of the "ride_length" column
> data %>%
+ select(ride_length) %>%
+ str()
tibble [4,640,811 × 1] (S3: tbl_df/tbl/data.frame)
$ ride_length: 'hms' num [1:4640811] 00:14:04 00:05:55 00:48:07 00:06:28 ...
..- attr(*, "units")= chr "secs"
- attr(*, "na.action")= 'omit' Named int [1:1082175] 3 23 26 28 34 36 37 42 46 52 ...
..- attr(*, "names")= chr [1:1082175] "3" "23" "26" "28" ...
# Check the data of the "ride_length" column
> data %>%
+ select(ride_length) %>%
+ head(5)
# A tibble: 5 × 1
ride_length
<time>
1 14'04"
2 05'55"
3 48'07"
4 06'28"
5 09'09"
# Filter to rides longer than 10 minutes using an "hh:mm:ss" string (this didn't work)
> data %>%
+ select(ride_length) %>%
+ filter(ride_length > '00:10:00') %>%
+ head(5)
# A tibble: 5 × 1
ride_length
<time>
1 14'04"
2 05'55"
3 48'07"
4 06'28"
5 09'09"
# Filter to rides longer than 10 minutes (= 600 secs) using seconds instead (this seemed to work)
> data %>%
+ select(ride_length) %>%
+ filter(ride_length > 10*60) %>%
+ head(5)
# A tibble: 5 × 1
ride_length
<time>
1 14'04"
2 48'07"
3 13'14"
4 36'25"
5 14'20"
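For reference, here is a minimal sketch of one way the "hh:mm:ss" comparison could be written. It assumes the hms package's as_hms() is used to parse the string into the same class as "ride_length", so the comparison is time against time rather than time against text. The "rides" tibble and the "threshold" name are made up for illustration; the five values are copied from the head(5) output above.

```r
library(dplyr)
library(hms)

# Toy stand-in for the real data (hypothetical object; values copied from the output above)
rides <- tibble(
  ride_length = as_hms(c("00:14:04", "00:05:55", "00:48:07", "00:06:28", "00:09:09"))
)

# Parse the "hh:mm:ss" string into an hms value instead of passing a plain string
threshold <- as_hms("00:10:00")

# Compare hms against hms; only rides longer than 10 minutes should remain
long_rides <- rides %>%
  filter(ride_length > threshold)

print(long_rides)
```

The same idea applied to the real data would be "filter(ride_length > hms::as_hms('00:10:00'))". This is only a sketch under the assumptions above, not a confirmed fix for the full dataset.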