group_by() doesn't understand lubridate::interval()?

dplyr::group_by() doesn't seem to understand lubridate::interval(). All intervals are treated as identical. What gives? Thank you for any insight!

suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(tibble))
suppressPackageStartupMessages(library(lubridate))

date <- seq.Date(from = as.Date("2000-01-01"), length.out = 360, by = 1)
df <- date %>% enframe(value = "date") %>% 
  mutate(interval = lubridate::interval(start = date, end = date + 7))

# what I expect. dates are unique so group count equals row count
df %>% group_by(date) %>% count() %>% head(5)
#> # A tibble: 5 x 2
#> # Groups:   date [5]
#>   date           n
#>   <date>     <int>
#> 1 2000-01-01     1
#> 2 2000-01-02     1
#> 3 2000-01-03     1
#> 4 2000-01-04     1
#> 5 2000-01-05     1

# NOT what I expect. group_by treats all intervals as one group 
# even though they are unique
df %>% group_by(interval) %>% count()
#> # A tibble: 1 x 2
#> # Groups:   interval [1]
#>   interval                           n
#>   <Interval>                     <int>
#> 1 2000-01-01 UTC--2000-01-08 UTC   360
Created on 2020-01-31 by the reprex package (v0.3.0)

It does indeed seem like a bug in the dplyr/lubridate world.
The best I can offer you today is a workaround: decompose each interval into its start and end points and group by those.

df2 <- df %>%
  group_by(s = int_start(interval),
           e = int_end(interval)) %>%
  count()
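
As a quick sanity check on the workaround above, the sketch below rebuilds the example data and compares the number of groups with the number of rows. The column names s and e follow the snippet above; n_groups is just a name chosen here for illustration, and the data is built with tibble::tibble() rather than enframe() purely for brevity.

```r
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(lubridate))

# Rebuild the example data from the question
date <- seq.Date(from = as.Date("2000-01-01"), length.out = 360, by = 1)
df <- tibble::tibble(date = date) %>%
  mutate(interval = lubridate::interval(start = date, end = date + 7))

# Group by the interval's start and end points instead of the interval itself
df2 <- df %>%
  group_by(s = int_start(interval),
           e = int_end(interval)) %>%
  count()

# Each row of df2 is one group; compare with the number of input rows
n_groups <- nrow(df2)
n_groups == nrow(df)
```

Since every interval has the same length, grouping by the start point alone would distinguish these rows just as well. Keeping both endpoints is the safer choice when lengths can vary.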

I just realised you weren't necessarily asking how to group them, but for some of the 'why' behind the issue.
It seems to me that the Interval is being treated as if it were only its period/duration. The convenience conversions as.period() and as.duration() care only about the length of time between start and end, so they would map all of your intervals to the same value (one week long). I think this is a clue that dplyr's group_by() is using this 'length of time' content of the interval to identify each value. That would be wrong, because your intervals are unique by virtue of starting on different dates. Hope that helps a little bit?
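
One way to probe this guess is to look at the lengths directly. The sketch below uses lubridate's int_length() and int_start() to compare how many distinct lengths and how many distinct start points the example intervals have. It does not inspect dplyr's internals, so at best it shows that the length-based explanation is plausible. The name ivl is just chosen here for illustration.

```r
suppressPackageStartupMessages(library(lubridate))

# Same intervals as in the question: 360 one-week spans, each starting a day later
date <- seq.Date(from = as.Date("2000-01-01"), length.out = 360, by = 1)
ivl <- interval(start = date, end = date + 7)

# Length of each interval in seconds (7 days * 86400 s each, in UTC)
unique(int_length(ivl))

# Number of distinct start points
length(unique(int_start(ivl)))
```

If the lengths collapse to a single value while the start points stay distinct, then anything that compares only the length will see one group.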
