A shorter way to write the group-summarise-ungroup routine?

Is there a shorter way to write the sequence group - summarise - ungroup? The only shortcut I have come across is count(), which applies only to some specific cases.
For example, Python pandas users have "pd.pivot()" and "pd.pivot_table()".

Hi @gray!

I think dplyr encourages you to write your own functions in this case, rather than trying to build in more summarizations. Doing this sort of thing (minus the ungroup(), but that's easy to add) is something of a canonical example of programming with dplyr. For instance, using the latest syntax (the {{ }} operator), as discussed here:

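library(dplyr)

# Maximum of `var`, optionally within the groups defined by `by`.
# {{ }} forwards the unquoted column names to the dplyr verbs.
max_by <- function(data, var, by) {
  data %>%
    group_by({{ by }}) %>%
    summarise(maximum = max({{ var }}, na.rm = TRUE))
}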

starwars %>% max_by(height)
#> # A tibble: 1 x 1
#>   maximum
#>     <int>
#> 1     264

starwars %>% max_by(height, by = gender)
#> # A tibble: 5 x 2
#>   gender        maximum
#>   <chr>           <int>
#> 1 <NA>              167
#> 2 female            213
#> 3 hermaphrodite     175
#> 4 male              264
#> 5 none              200
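Since the question asked about the ungroup() step as well, here is a minimal sketch of the same helper with ungroup() appended. The name max_by_ungrouped is hypothetical, chosen only for illustration; it is not part of dplyr.

```r
library(dplyr)

# Hypothetical variant of max_by() that always returns an ungrouped tibble.
max_by_ungrouped <- function(data, var, by) {
  data %>%
    group_by({{ by }}) %>%
    summarise(maximum = max({{ var }}, na.rm = TRUE)) %>%
    ungroup()  # drop any grouping that is still attached to the result
}

result <- starwars %>% max_by_ungrouped(height, by = gender)

# The result carries no grouping, whatever `by` was.
is_grouped_df(result)
#> [1] FALSE
```

With a single grouping variable, summarise() already drops that grouping level, so the ungroup() call mostly matters if you extend the helper to group by more than one variable.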

For more up-to-the-minute resources along these lines, see here:

One of the resources linked in that discussion is the in-development tidyeval book, which I understand will be replacing the "Programming with dplyr" vignette once it is complete:

https://tidyeval.tidyverse.org/index.html
