How to find all possible date combinations for each group to find values' sum equal to zero

Hello!

I have a dataframe that looks like this:

`````` group    date     value
a   2020-06-01   1
a   2020-06-02   1
a   2020-06-03  -1
b   2020-06-01   2
b   2020-06-02  -1
b   2020-06-03  -1
c   2020-06-02  -3
c   2020-06-03   3
c   2020-06-04   2
``````

For each date combination I want to find the groups that the value's sum is equal to zero.
Note that I don't want to sum values across different groups.
For exemple, using this dataframe, I would like an output that returns me this information:

2020-06-01 and 2020-06-03: a
2020-06-02 and 2020-06-03: a, c
2020-06-01 and 2020-06-02 and 2020-06-03: b

``````library(tidyverse)
library(lubridate)
(df<- tribble(~group ,   ~date     ,~value,
"a",  " 2020-06-01",   1,
"a",  " 2020-06-02",   1,
"a",  " 2020-06-03",  -1,
"b",  " 2020-06-01",   2,
"b",  " 2020-06-02",  -1,
"b",  " 2020-06-03",  -1,
"c",  " 2020-06-02",  -3,
"c",  " 2020-06-03",   3,
"c",  " 2020-06-04",   2) %>% mutate(date=ymd(date)))

#date combinations
uniqs <- unique(df\$date)

combs <- map(1:length(uniqs),
~combn(x = uniqs,
m=.,
simplify = FALSE
)) %>% flatten

solutions1 <- map(seq_along(combs),
~ {filter(df,date %in% combs[[.x]]) %>% group_by(
group)  %>% summarise(value=sum(value)) %>%
filter(value==0) %>% pull(group)} %>% paste0(collapse=" & ")) %>% as.character()

combs_char <- map(combs,
~ paste(format.Date(.,"%Y-%m-%d"),collapse=" & "))
names(solutions1) <-combs_char

tidy_solutions <- enframe(solutions1)

(tidy_solutions_min <- filter(tidy_solutions,
value != ""))
# # A tibble: 6 x 2
# name                                              value
# <chr>                                             <chr>
# 1 2020-06-01 & 2020-06-03                           a
# 2 2020-06-02 & 2020-06-03                           a & c
# 3 2020-06-01 & 2020-06-02 & 2020-06-03              b & c
# 4 2020-06-01 & 2020-06-03 & 2020-06-04              a
# 5 2020-06-02 & 2020-06-03 & 2020-06-04              a
# 6 2020-06-01 & 2020-06-02 & 2020-06-03 & 2020-06-04 b``````

hmmm possibly needs more cleaning up as C is satisfied for 2020-06-02 & 2020-06-03 and
there is no value for C for 2020-06-01 which falsely allows 2020-06-01 & 2020-06-02 & 2020-06-03 to be rated for c

this change seems to do it; require the number of dates to match

``````solutions1 <- map(seq_along(combs),
~ {filter(df,date %in% combs[[.x]]) %>% group_by(
group)  %>% summarise(value=sum(value),n=n(),num_dates=length( combs[[.x]])) %>%
filter(value==0,
n==num_dates) %>% pull(group)} %>% paste0(collapse=" & ")) %>% as.character()

combs_char <- map(combs,
~ paste(format.Date(.,"%Y-%m-%d"),collapse=" & "))
names(solutions1) <-combs_char

tidy_solutions <- enframe(solutions1)

(tidy_solutions_min <- filter(tidy_solutions,
value != ""))
# A tibble: 3 x 2
name                                 value
<chr>                                <chr>
1 2020-06-01 & 2020-06-03              a
2 2020-06-02 & 2020-06-03              a & c
3 2020-06-01 & 2020-06-02 & 2020-06-03 b
``````

Awesome!
Thats's exactly what I need.

Thank you very much!

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.