Consider each factor in group_by

Hi! This is my first question here.

I have created this factor:

df$period <- cut(df$temps_diffusion, c(0, 7, 14, 30, 90, 180, 360, Inf),
                 labels = c("j+7", "j+14", "j+30", "j+90", "j+180", "j+360", "j+inf"),
                 right = FALSE)
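
For readers unfamiliar with cut(), here is a minimal sketch of what the call above produces. The input values are made up for illustration, and treating temps_diffusion as a number of days is an assumption based on the "j+" labels. The point is that the resulting factor remembers all seven levels even when some of them never occur in the data.

```r
# Hypothetical values standing in for df$temps_diffusion (assumed to be days)
temps_diffusion <- c(0, 6, 7, 13, 29)

period <- cut(
  temps_diffusion,
  breaks = c(0, 7, 14, 30, 90, 180, 360, Inf),
  labels = c("j+7", "j+14", "j+30", "j+90", "j+180", "j+360", "j+inf"),
  right = FALSE  # intervals are [0, 7), [7, 14), ... so a value of 7 falls in "j+14"
)

levels(period)  # lists all seven levels, including the unused ones
table(period)   # unused levels show up with a count of 0
```

So the levels are not lost by cut(); they only disappear later, when the grouping step drops empty groups.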

Then I ran a group_by on this factor:

df <- df %>%
  group_by(idproduit, datediffusion, period) %>%
  summarise(across(where(is.numeric), sum))

This is part of my result:

  idproduit datediffusion period vues_hors_robots
1   6539665    2022-10-06    j+7              279
2   6539665    2022-10-06   j+14              351
3   6539665    2022-10-06   j+30              399

(There are only 3 rows in the result.)
I would like to know whether it is possible to keep every level I created, even those with no data, in order to get this result:

  idproduit datediffusion period vues_hors_robots
1   6539665    2022-10-06    j+7              279
2   6539665    2022-10-06   j+14              351
3   6539665    2022-10-06   j+30              399
4   6539665    2022-10-06   j+90               NA
5   6539665    2022-10-06  j+180               NA
6   6539665    2022-10-06  j+360               NA
......

Thanks for your help

There is probably a quicker way, however:

library(tidyverse)

df <- tibble(
  idproduct = c("A", "B", "C", "D", "E"),
  datediffusion = as.Date("2022-10-06"),
  period = c("j1", "j4", "j3", "j4", "j4"),
  value = c(4,7,3,5,9)
)


# A tibble: 5 × 4
  idproduct datediffusion period value
  <chr>     <date>        <chr>  <dbl>
1 A         2022-10-06    j1         4
2 B         2022-10-06    j4         7
3 C         2022-10-06    j3         3
4 D         2022-10-06    j4         5
5 E         2022-10-06    j4         9

I create two vectors and a data frame with all possible combinations:

periodlevels <- c("j1", "j2", "j3", "j4", "j5") # contains all levels
ids <- unique(df$idproduct)

# Every id paired with every period level
all <- tibble(
  idproduct = rep(ids, times = length(periodlevels)),
  period = rep(periodlevels, each = length(ids)) # each = number of ids, not number of levels
)

I join the two data frames:

df2 <- full_join(df, all, by = "idproduct") %>%
  mutate(value = ifelse(period.x == period.y, value, NA_real_))

Then calculate the result:

df2 %>%
  group_by(idproduct, datediffusion, period = period.y) %>%
  summarise(across(where(is.numeric), sum))



# A tibble: 25 × 4
# Groups:   idproduct, datediffusion [5]
   idproduct datediffusion period value
   <chr>     <date>        <chr>  <dbl>
 1 A         2022-10-06    j1         4
 2 A         2022-10-06    j2        NA
 3 A         2022-10-06    j3        NA
 4 A         2022-10-06    j4        NA
 5 A         2022-10-06    j5        NA
 6 B         2022-10-06    j1        NA
 7 B         2022-10-06    j2        NA
 8 B         2022-10-06    j3        NA
 9 B         2022-10-06    j4         7
10 B         2022-10-06    j5        NA
11 C         2022-10-06    j1        NA
12 C         2022-10-06    j2        NA
13 C         2022-10-06    j3         3
14 C         2022-10-06    j4        NA
15 C         2022-10-06    j5        NA
16 D         2022-10-06    j1        NA
17 D         2022-10-06    j2        NA
18 D         2022-10-06    j3        NA
19 D         2022-10-06    j4         5
20 D         2022-10-06    j5        NA
21 E         2022-10-06    j1        NA
22 E         2022-10-06    j2        NA
23 E         2022-10-06    j3        NA
24 E         2022-10-06    j4         9
25 E         2022-10-06    j5        NA
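
As a possible shortcut for the join-based steps above, here is a sketch using tidyr::complete(), which fills in missing combinations after summarising. This swaps in a different technique from the one in the answer; it reuses the same toy df and periodlevels, and the name result is only for illustration.

```r
library(dplyr)
library(tidyr)

# Same toy data as in the answer above
df <- tibble(
  idproduct = c("A", "B", "C", "D", "E"),
  datediffusion = as.Date("2022-10-06"),
  period = c("j1", "j4", "j3", "j4", "j4"),
  value = c(4, 7, 3, 5, 9)
)
periodlevels <- c("j1", "j2", "j3", "j4", "j5") # contains all levels

result <- df %>%
  group_by(idproduct, datediffusion, period) %>%
  summarise(across(where(is.numeric), sum), .groups = "drop") %>%
  # nesting() keeps only the id/date pairs that exist in the data,
  # then every pair is crossed with all period levels
  complete(nesting(idproduct, datediffusion), period = periodlevels)

result
```

Combinations that were absent from the data get NA in value, which matches the shape of the output shown above.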
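
Alternatively, group_by() has a .drop argument; with .drop = FALSE, empty factor levels are kept as groups:

library(tidyverse)
iris %>%
  filter(Species != "setosa") %>%
  group_by(Species, .drop = FALSE) %>%
  summarise(avg_petal_length = mean(Petal.Length))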
# A tibble: 3 x 2
  Species    avg_petal_length
  <fct>                 <dbl>
1 setosa               NaN   
2 versicolor             4.26
3 virginica              5.55
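
Applied to a data frame shaped like the one in the question, the same idea might look like the sketch below. The rows are invented for illustration (only the column names and the three totals come from the question), and the name res is hypothetical. Note that the factor is listed last in group_by(), after the non-factor columns, so that its levels are expanded within each product and date.

```r
library(dplyr)

# Hypothetical rows mimicking the question's columns
df <- tibble(
  idproduit = 6539665,
  datediffusion = as.Date("2022-10-06"),
  temps_diffusion = c(1, 3, 8, 20),
  vues_hors_robots = c(200, 79, 351, 399)
)

df$period <- cut(df$temps_diffusion, c(0, 7, 14, 30, 90, 180, 360, Inf),
                 labels = c("j+7", "j+14", "j+30", "j+90", "j+180", "j+360", "j+inf"),
                 right = FALSE)

res <- df %>%
  select(-temps_diffusion) %>%
  # .drop = FALSE keeps the empty levels of the period factor as groups
  group_by(idproduit, datediffusion, period, .drop = FALSE) %>%
  summarise(vues_hors_robots = sum(vues_hors_robots), .groups = "drop")

res
```

One difference from the output requested in the question: sum() over an empty group returns 0 rather than NA, so the unused periods show 0. If NA is preferred, the zeros for empty groups would need to be converted afterwards.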

Thank you very much for your answers
