Hello, I have a tibble with an empty factor level, and I am trying to use summarize() while preserving all factor levels. Normally I would just use group_by() with .drop = FALSE, but is there any way of avoiding group_by() and achieving this with .by and summarize()?
library(tidyverse)
health <- tibble(
  name = c("Ikaia", "Oletta", "Leriah", "Dashay", "Tresaun"),
  smoker = factor(c("no", "no", "no", "no", "no"), levels = c("yes", "no")),
  age = c(34, 88, 75, 47, 56),
)
health |>
  group_by(smoker, .drop = FALSE) |>
  summarize(
    n = n(),
    mean_age = mean(age),
    min_age = min(age),
    max_age = max(age),
    sd_age = sd(age)
  )
#> Warning: There were 2 warnings in `summarize()`.
#> The first warning was:
#> ℹ In argument: `min_age = min(age)`.
#> ℹ In group 1: `smoker = yes`.
#> Caused by warning in `min()`:
#> ! no non-missing arguments to min; returning Inf
#> ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
#> # A tibble: 2 × 6
#>   smoker     n mean_age min_age max_age sd_age
#>   <fct>  <int>    <dbl>   <dbl>   <dbl>  <dbl>
#> 1 yes        0      NaN     Inf    -Inf   NA
#> 2 no         5       60      34      88   21.6
health |>
  summarize(
    n = n(),
    mean_age = mean(age),
    min_age = min(age),
    max_age = max(age),
    sd_age = sd(age),
    .by = "smoker"
  )
#> # A tibble: 1 × 6
#>   smoker     n mean_age min_age max_age sd_age
#>   <fct>  <int>    <dbl>   <dbl>   <dbl>  <dbl>
#> 1 no         5       60      34      88   21.6
Created on 2023-06-18 with reprex v2.0.2
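For what it's worth, here is one possible workaround I can think of, as a rough sketch rather than a definitive answer. As far as I can tell, .by has no equivalent of .drop = FALSE, so this version summarizes with .by and then restores the missing level afterwards with tidyr::complete(), which expands a factor column to its full set of levels. Only two of the summary columns are shown to keep it short, and the choice to fill n with 0 is my own assumption.

```r
library(dplyr)
library(tidyr)

health <- tibble(
  name = c("Ikaia", "Oletta", "Leriah", "Dashay", "Tresaun"),
  smoker = factor(c("no", "no", "no", "no", "no"), levels = c("yes", "no")),
  age = c(34, 88, 75, 47, 56)
)

result <- health |>
  # .by drops the empty "yes" level, so only the "no" row comes back here
  summarize(
    n = n(),
    mean_age = mean(age),
    .by = smoker
  ) |>
  # complete() adds a row for every factor level missing from the result;
  # n is filled with 0 (an assumption), other columns are left as NA
  complete(smoker, fill = list(n = 0L))

result
```

Note that this does not reproduce the group_by() output exactly: the statistics for the empty level come back as NA rather than NaN, Inf, or -Inf, because the summary functions are never actually run on the empty group.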