I know that sum(NA, na.rm = TRUE) results in 0. However, I wouldn't necessarily want this 0 to appear in a summarise() output. Please see the minimal example below.
    library(dplyr)

    tbl <- tibble(id = c(1, 2), values = c(NA, 5))

    tbl %>%
      group_by(id) %>%
      summarise(tot = sum(values, na.rm = TRUE))
    #> # A tibble: 2 × 2
    #>      id   tot
    #>   <dbl> <dbl>
    #> 1     1     0
    #> 2     2     5
Subsequent calculations using the tot column would be wrong (e.g., the mean of tot would treat the 0 for id 1 as a real value). I know this follows from how sum() behaves, but I was wondering if there is a "safer" way of writing such a summary. I suppose one approach is an initial test, e.g.:
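    tbl %>%
      group_by(id) %>%
      summarise(
        # NA_real_ keeps both branches of if_else() the same type (double);
        # older dplyr versions reject a bare logical NA here
        tot = if_else(all(is.na(values)), NA_real_, sum(values, na.rm = TRUE))
      )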
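To make the pitfall concrete, here is a minimal sketch that wraps the test above in a small helper and compares the two results. The name sum_or_na is hypothetical (it is not part of dplyr or base R), and the sketch assumes the values are numeric.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr or base R): returns NA when every
# value is missing, otherwise the sum of the non-missing values.
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

tbl <- tibble(id = c(1, 2), values = c(NA, 5))

res <- tbl %>%
  group_by(id) %>%
  summarise(
    tot_zero = sum(values, na.rm = TRUE),  # all-NA group becomes 0
    tot_na   = sum_or_na(values)           # all-NA group stays NA
  )

mean(res$tot_zero)              # 2.5, pulled down by the spurious 0
mean(res$tot_na, na.rm = TRUE)  # 5
```

A plain if / else is used inside the helper because the condition is a single TRUE or FALSE per group. Note that a zero-length input would also return NA with this sketch, since all() of an empty vector is TRUE.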
Are there better ways to avoid this potential pitfall? Thank you in advance for any feedback and guidance.