Need to get column sums grouped by another column

Hey all,
So I'm working with a package called nycflights13, specifically the flights data set. It has a column called air_time, where empty entries (NAs) are cancelled flights. There is another column called month, which runs from 1 to 12 for the month of the flight.

What I want to do is find out which month has the most cancelled flights, i.e. which month has the most NAs in air_time.

This is what I have tried so far:
library(nycflights13)
library(dplyr)
x <- select(flights, air_time)
colSums(is.na(x))  # total number of NAs in air_time

That returns a sum of 9430, but I don't know how to get the sums that correspond to each value of the month column.

library(nycflights13)
library(dplyr)

# Per month: count of NA air_time, total flights, and NA percentage
(x <- select(flights, month, air_time) %>%
  group_by(month) %>%
  summarise(air_time_na = sum(is.na(air_time)),
            air_time_total = n(),
            air_time_na_pcnt = 100*air_time_na/air_time_total))
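
To read the answer straight off the table, the summary can be sorted so that the month with the most missing air_time values comes first. This is only a sketch, assuming dplyr and nycflights13 are installed; the names na_by_month and base_counts are made up for illustration, and the tapply() lines are a base R cross-check rather than part of the approach above.

```r
library(nycflights13)
library(dplyr)

# Count missing air_time values per month, largest count first
na_by_month <- flights %>%
  group_by(month) %>%
  summarise(air_time_na = sum(is.na(air_time))) %>%
  arrange(desc(air_time_na))

# The first row holds the month with the most NAs
na_by_month$month[1]

# Base R cross-check: sum the NA indicator within each month
base_counts <- tapply(is.na(flights$air_time), flights$month, sum)
names(which.max(base_counts))
```

Sorting by air_time_na_pcnt instead would rank months by the share of flights with missing air_time rather than by the raw count.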

Thank you so much! There is just one thing I don't understand: what does %>% do?

It's the magrittr / tidyverse pipe operator.
It takes the object on its left and passes it as the first argument to the function on its right.

library(magrittr)  # provides %>%; dplyr re-exports it as well
iris %>% head()    # these two calls are equivalent
head(iris)
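
The pipe pays off once several steps are chained. Here is a small sketch using the built-in iris data; the variable names nested and piped are just for illustration. Any extra arguments stay in the call, so x %>% f(y) is the same as f(x, y).

```r
library(magrittr)

# Nested calls have to be read from the inside out
nested <- head(subset(iris, Species == "setosa"), 3)

# The same steps with the pipe read left to right
piped <- iris %>%
  subset(Species == "setosa") %>%
  head(3)

# Check that both forms give the same result
identical(nested, piped)
```

That is why the group_by() / summarise() answer above can be read top to bottom as a sequence of steps.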
