Is there a way to write the sequence group - summarise - ungroup more concisely? The only shortcut I have come across is count(), which applies only to some specific cases.
For example, Python pandas users have "pd.pivot()" and "pd.pivot_table()".
Hi @gray!
I think dplyr encourages you to write your own functions in this case, rather than trying to have more summarizations built in. Doing this sort of thing (minus the ungroup(), but that's easy to add) is sort of a canonical example of programming with dplyr. For instance, using the latest syntax, as discussed here:
max_by <- function(data, var, by) {
  data %>%
    group_by({{ by }}) %>%
    summarise(maximum = max({{ var }}, na.rm = TRUE))
}
starwars %>% max_by(height)
#> # A tibble: 1 x 1
#>   maximum
#>     <int>
#> 1     264
starwars %>% max_by(height, by = gender)
#> # A tibble: 5 x 2
#>   gender        maximum
#>   <chr>           <int>
#> 1 <NA>              167
#> 2 female            213
#> 3 hermaphrodite     175
#> 4 male              264
#> 5 none              200
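The ungroup() step from the question can be folded into the same kind of helper. Here is a minimal sketch, assuming a dplyr/rlang version that supports {{ }}; the name max_by_ungrouped is made up for illustration and is not part of dplyr:

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): group, summarise, then ungroup.
max_by_ungrouped <- function(data, var, by) {
  data %>%
    group_by({{ by }}) %>%                                   # group by the column passed as `by`
    summarise(maximum = max({{ var }}, na.rm = TRUE)) %>%    # one row per group
    ungroup()                                                # drop any grouping left on the result
}

starwars %>% max_by_ungrouped(height, by = gender)
```

With a single grouping column, summarise() already drops that grouping level, so the explicit ungroup() mostly matters if you extend the helper to group by several columns.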
For more up-to-the-minute resources along these lines, see here:
One of the resources linked in that discussion is the in-development tidyeval book, which I understand will be replacing the "Programming with dplyr" vignette once it is complete: