Tricky behavior with grouped summarize sum for groups with only NAs

Hi,

I know that sum(NA, na.rm = TRUE) results in 0. However, I wouldn't necessarily want this 0 to appear in a summarise() output. Please see the minimal example below.

tbl <- tibble(id = c(1,2), values = c(NA, 5))

tbl %>% group_by(id) %>% summarise(tot = sum(values, na.rm = TRUE))
#> # A tibble: 2 × 2
#>      id   tot
#>   <dbl> <dbl>
#> 1     1     0
#> 2     2     5

Subsequent calculations using the tot column would be wrong (e.g., mean of tot). I know this is a symptom of the sum() function, but was wondering if there is a "safer" way of doing such summarize? I suppose one approach is an initial test, e.g.,

tbl %>% group_by(id) %>% summarise(tot = if_else(all(is.na(values)), NA, sum(values, na.rm = TRUE)))

Are there better ways to avoid this potential pitfall? Thank you in advance for any feedback and guidance.

The mean function has an na.rm option as well.

> tbl <- tibble(id = c(1,2), values = c(NA, 5))
> 
> tbl %>%
>   group_by(id) %>% summarise(
>     tot_sum = sum(values, na.rm = TRUE),
>     tot_mean = mean(values, na.rm = TRUE)
>   )

# A tibble: 2 x 3
     id tot_sum tot_mean
  <dbl>   <dbl>    <dbl>
1     1       0      NaN
2     2       5        5

for accurate downstream calculations, simple use the NA , as there purpose of existence is to work in the way you would require.

Displaying nicely formatted results for users should happen downstream of your calculations; you can decide on appropriate way to print the NA info depending on that.