In the process of trying to pinpoint when ungroup() is important, I realized I may not understand how group_by() works as well as I had thought, especially when it is used in combination with summarize().
In particular, I didn't realize that the order of variables within group_by() matters. It appears that after a summarize(), only the first grouping variable remains grouped. This result was totally unintuitive to me, so I figured I'd ask: does this make sense to everyone, or is it a bug?
For example, if I adapt the example from this thread:
library(tidyverse)
data.frame(Titanic) %>%
  group_by(Class, Age) %>%
  summarize(Freq = sum(Freq)) %>%
  mutate(Class = reorder(Class, Freq))
#> Error in mutate_impl(.data, dots): Column `Class` can't be modified because it's a grouping variable
# When I switch the order of the variables within group_by(), it works
data.frame(Titanic) %>%
  group_by(Age, Class) %>%
  summarize(Freq = sum(Freq)) %>%
  mutate(Class = reorder(Class, Freq))
#> # A tibble: 8 x 3
#> # Groups:   Age [2]
#>   Age   Class  Freq
#>   <fct> <fct> <dbl>
#> 1 Child 1st       6
#> 2 Child 2nd      24
#> 3 Child 3rd      79
#> 4 Child Crew      0
#> 5 Adult 1st     319
#> 6 Adult 2nd     261
#> 7 Adult 3rd     627
#> 8 Adult Crew    885
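
To check which variable is still grouped after each summarize(), it may help to inspect the result with group_vars(). The following is only a minimal sketch: the object names (titanic, by_class_age, by_age_class, reordered) are made up for illustration, and it assumes a dplyr version that exports group_vars(). It also shows that calling ungroup() before the mutate() seems to avoid the error regardless of the order used in group_by().

```r
library(dplyr)

titanic <- data.frame(Titanic)

# Group by Class first, then Age: after summarize(), Class is still grouped
by_class_age <- titanic %>%
  group_by(Class, Age) %>%
  summarize(Freq = sum(Freq))
group_vars(by_class_age)
#> [1] "Class"

# Group by Age first, then Class: after summarize(), Age is still grouped
by_age_class <- titanic %>%
  group_by(Age, Class) %>%
  summarize(Freq = sum(Freq))
group_vars(by_age_class)
#> [1] "Age"

# Dropping the remaining grouping explicitly lets mutate() modify Class
reordered <- titanic %>%
  group_by(Class, Age) %>%
  summarize(Freq = sum(Freq)) %>%
  ungroup() %>%
  mutate(Class = reorder(Class, Freq))
group_vars(reordered)
#> character(0)
```

In both grouped cases the last variable listed in group_by() is the one that gets dropped, which is why, with two grouping variables, only the first one is left.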