Summarise all numerical values

Hi,
I have this simple df with different types of variables:

source <- data.frame(
          stringsAsFactors = FALSE,
               check.names = FALSE,
             AccountNumber = c(1, 2, 3, 4, 5),
                   RegDate = c("2022-11-18",
                               "2022-10-29","2022-10-28","2022-10-28",
                               "2022-09-30"),
                   Mileage = c(1069, 985, 75, 238, 2133),
          LastWorkshopDate = c("2022-12-22",
                               "2022-12-21","2022-11-10","2022-11-23",
                               "2022-11-24"),
                       CWI = c("C", "W", "W", "C", "W"),
                   RoTotal = c(162, 39.67, 38.89, 38.89, 38.89),
                `Vehicle Age` = c(34, 53, 13, 26, 55),
                  `Last Seen` = c(NA, 20, NA, NA, 40),
           `Not Seen Flag` = c(1, 1, 0, 1, 0),
                       Year = c(2021, 2022, 2022, 2022, 2022),
                     Month = c(12, 12, 11, 11, 11)
)

Now, I need to calculate means and counts for all numerical variables.
I don't want to list them like here:

library(dplyr)

result <- source %>%
  mutate(Year=as.character(Year)) %>%
  bind_rows(mutate(.data = .,
                   CWI = "Total")) %>%
  group_by(Year, CWI) %>%
  summarise_at(.vars = vars(ends_with(match = "Mileage"), ends_with(match = "RoTotal")),.funs = list(Sc = ~mean(.,na.rm=TRUE), Count = ~sum(!is.na(.))))

result

Is any way of specifying I need means for all available numerical variables apart from Month and Year?

You made Year character, so if we only calculate means on numerics that is naturally avoided; with Month being numeric though it should be suppressed. You can use across()

... your other code .. %>% 
summarise(across(where(is.numeric) & !Month,
                   .fns = list(Sc = ~mean(.,na.rm=TRUE),
                                Count = ~sum(!is.na(.)))))

Note; this post was edit so that .funs became .fns

Thank you but something is not right. I always have number of rows the same as number of records, regardless my grouping for example:

result <- source %>%
  mutate(Year=as.character(Year)) %>%
  bind_rows(mutate(.data = .,
                   Year = "Total")) %>%
    group_by(Year) %>%
  summarise(across(where(is.numeric) & !Month,
                   .funs = list(Sc = ~mean(.,na.rm=TRUE),
                                Count = ~sum(!is.na(.)))))

result

In this case I should have three rows (2021, 2022 and Total) with averages of all numerical values for these two years

you are right, when I edited your code into mine, I missed that the old .funs param should be .fns now ; try that

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.