Is ungroup() recommended after every group_by()?

group_by() adds metadata to a data.frame that marks how rows should be grouped. As long as that metadata is there, mutate() will refuse to modify the columns involved in the grouping, so you can't, for example, reorder their factor levels. See the following examples.

You should use a reproducible example for your code. See:

https://www.jessemaegan.com/post/so-you-ve-been-asked-to-make-a-reprex

As it stands, it isn't possible to tell from your code whether you meant to use plyr::summarize or dplyr::summarize.

Also, a reprex makes it possible for us to just copy and paste your code and run it in the same environment that you did. Everyone here is answering questions on their own time, so we ask that you do what you can to minimize that time. A reprex is the best way to do that.

suppressPackageStartupMessages(library(dplyr))

# first of all, dplyr::group_by adds meta-data to
# the data.frame that other functions, like
# dplyr::summarize, use when they do calculations

t1 <- data.frame(Titanic) %>%
   group_by(Class, Age)

# notice that the meta-data show how rows
# should be grouped
str(t1)
#> Classes 'grouped_df', 'tbl_df', 'tbl' and 'data.frame':  32 obs. of  5 variables:
#>  $ Class   : Factor w/ 4 levels "1st","2nd","3rd",..: 1 2 3 4 1 2 3 4 1 2 ...
#>  $ Sex     : Factor w/ 2 levels "Male","Female": 1 1 1 1 2 2 2 2 1 1 ...
#>  $ Age     : Factor w/ 2 levels "Child","Adult": 1 1 1 1 1 1 1 1 2 2 ...
#>  $ Survived: Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ Freq    : num  0 0 35 0 0 0 17 0 118 154 ...
#>  - attr(*, "vars")= chr  "Class" "Age"
#>  - attr(*, "drop")= logi TRUE
#>  - attr(*, "indices")=List of 8
#>   ..$ : int  0 4 16 20
#>   ..$ : int  8 12 24 28
#>   ..$ : int  1 5 17 21
#>   ..$ : int  9 13 25 29
#>   ..$ : int  2 6 18 22
#>   ..$ : int  10 14 26 30
#>   ..$ : int  3 7 19 23
#>   ..$ : int  11 15 27 31
#>  - attr(*, "group_sizes")= int  4 4 4 4 4 4 4 4
#>  - attr(*, "biggest_group_size")= int 4
#>  - attr(*, "labels")='data.frame':   8 obs. of  2 variables:
#>   ..$ Class: Factor w/ 4 levels "1st","2nd","3rd",..: 1 1 2 2 3 3 4 4
#>   ..$ Age  : Factor w/ 2 levels "Child","Adult": 1 2 1 2 1 2 1 2
#>   ..- attr(*, "vars")= chr  "Class" "Age"
#>   ..- attr(*, "drop")= logi TRUE

Created on 2018-02-16 by the reprex package (v0.2.0).

suppressPackageStartupMessages(library(dplyr))

# dplyr::summarize passes along that information,
# minus the last grouping variable (Age here)
t2 <- data.frame(Titanic) %>% 
    group_by(Class, Age) %>% 
    summarize(Freq = sum(Freq))
t2
#> # A tibble: 8 x 3
#> # Groups:   Class [?]
#>   Class Age     Freq
#>   <fct> <fct>  <dbl>
#> 1 1st   Child   6.00
#> 2 1st   Adult 319   
#> 3 2nd   Child  24.0 
#> 4 2nd   Adult 261   
#> 5 3rd   Child  79.0 
#> 6 3rd   Adult 627   
#> 7 Crew  Child   0   
#> 8 Crew  Adult 885

str(t2)
#> Classes 'grouped_df', 'tbl_df', 'tbl' and 'data.frame':  8 obs. of  3 variables:
#>  $ Class: Factor w/ 4 levels "1st","2nd","3rd",..: 1 1 2 2 3 3 4 4
#>  $ Age  : Factor w/ 2 levels "Child","Adult": 1 2 1 2 1 2 1 2
#>  $ Freq : num  6 319 24 261 79 627 0 885
#>  - attr(*, "vars")= chr "Class"
#>  - attr(*, "drop")= logi TRUE

Created on 2018-02-16 by the reprex package (v0.2.0).
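
If reading the str() output is too noisy, here is a minimal sketch of a quicker check, assuming your dplyr version provides group_vars(). The names `grouped` and `summarised` are just mine for illustration. It shows the same thing as above: summarize drops the last grouping variable but keeps the rest.

```r
suppressPackageStartupMessages(library(dplyr))

# group by two columns, as in the examples above
grouped <- data.frame(Titanic) %>%
    group_by(Class, Age)

# group_vars() returns the names of the grouping columns
group_vars(grouped)
#> [1] "Class" "Age"

# after summarize, the last grouping variable (Age) is gone,
# but the result is still grouped by Class
summarised <- grouped %>%
    summarize(Freq = sum(Freq))

group_vars(summarised)
#> [1] "Class"
```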

# the following fails because mutate is trying to
# change one of the columns used by group_by,
# which it can see because of the meta-data
# passed through by dplyr::summarize
suppressPackageStartupMessages(library(dplyr))
t3 <- data.frame(Titanic) %>% 
    group_by(Class, Age) %>% 
    summarize(Freq = sum(Freq)) %>%
    mutate(Class = reorder(Class, Freq))
#> Error in mutate_impl(.data, dots): Column `Class` can't be modified because it's a grouping variable

Created on 2018-02-16 by the reprex package (v0.2.0).

# ungroup removes any grouping meta-data
suppressPackageStartupMessages(library(dplyr))
t4 <- data.frame(Titanic) %>% 
    group_by(Class, Age) %>% 
    ungroup()

# notice there is no grouping meta-data in t4
str(t4)
#> Classes 'tbl_df', 'tbl' and 'data.frame':    32 obs. of  5 variables:
#>  $ Class   : Factor w/ 4 levels "1st","2nd","3rd",..: 1 2 3 4 1 2 3 4 1 2 ...
#>  $ Sex     : Factor w/ 2 levels "Male","Female": 1 1 1 1 2 2 2 2 1 1 ...
#>  $ Age     : Factor w/ 2 levels "Child","Adult": 1 1 1 1 1 1 1 1 2 2 ...
#>  $ Survived: Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ Freq    : num  0 0 35 0 0 0 17 0 118 154 ...

Created on 2018-02-16 by the reprex package (v0.2.0).

suppressPackageStartupMessages(library(dplyr))

# so calling ungroup before running mutate
# lets the factor levels be changed
t5 <- data.frame(Titanic) %>% 
    group_by(Class, Age) %>% 
    summarize(Freq = sum(Freq)) %>%
    ungroup() %>%
    mutate(Class = reorder(Class, Freq))

str(t5)
#> Classes 'tbl_df', 'tbl' and 'data.frame':    8 obs. of  3 variables:
#>  $ Class: Factor w/ 4 levels "2nd","1st","3rd",..: 2 2 1 1 3 3 4 4
#>   ..- attr(*, "scores")= num [1:4(1d)] 162 142 353 442
#>   .. ..- attr(*, "dimnames")=List of 1
#>   .. .. ..$ : chr  "1st" "2nd" "3rd" "Crew"
#>  $ Age  : Factor w/ 2 levels "Child","Adult": 1 2 1 2 1 2 1 2
#>  $ Freq : num  6 319 24 261 79 627 0 885

Created on 2018-02-16 by the reprex package (v0.2.0).
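
As an alternative to a separate ungroup() step, here is a sketch that assumes dplyr 1.0.0 or later, which is newer than the version used in the reprexes above. From that version on, summarize accepts a `.groups` argument, and `.groups = "drop"` returns an ungrouped result directly. The name `t6` is just a continuation of the naming above.

```r
suppressPackageStartupMessages(library(dplyr))

# assumes dplyr >= 1.0.0, where summarize() has a .groups argument
t6 <- data.frame(Titanic) %>%
    group_by(Class, Age) %>%
    summarize(Freq = sum(Freq), .groups = "drop") %>%
    mutate(Class = reorder(Class, Freq))

# the result carries no grouping, so mutate was free to change Class
is_grouped_df(t6)
#> [1] FALSE

# same level order as t5 above
levels(t6$Class)
#> [1] "2nd"  "1st"  "3rd"  "Crew"
```

Whether you prefer this or an explicit ungroup() is mostly a matter of taste. The explicit call is easier to spot when reading a long pipeline.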
