Aggregating values of one column based on the characters of two other columns

Hello everyone,

For my final thesis I am struggling with the following:

The topic concerns flight delays between medium-sized airport hubs in the US. The full file contains 1.7 million observations. The picture below shows a few observations:

[Image: Capture]

ORIGIN = Airport code for origin
DEST = Airport code for destination

I want to aggregate ARR_DELAY over all rows that share the same ORIGIN-DEST pair. There are thousands of such combinations. I'm struggling to find code for this; does anyone have a solution?

For instance: combining the ARR_DELAY values of all flights on the route AUS-BNA.

Any help is much appreciated.

Best regards, Joris

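library(dplyr)

# Sum the arrival delay for every ORIGIN-DEST pair.
# na.rm = TRUE skips missing delays, if the data has any;
# without it a single NA makes the sum for that pair NA.
flights %>% 
  group_by(ORIGIN, DEST) %>% 
  summarise(sum_delay = sum(ARR_DELAY, na.rm = TRUE))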
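
To see what the grouping does, here is a small self-contained sketch. The toy data frame is made up for illustration and only borrows the column names from the question. The extra columns n_flights and mean_delay and the result name route_delays are my own additions, not something asked for. The .groups argument assumes dplyr 1.0.0 or newer.

```r
library(dplyr)

# Hypothetical toy data standing in for the real 1.7-million-row file
flights <- data.frame(
  ORIGIN    = c("AUS", "AUS", "AUS", "BNA", "BNA"),
  DEST      = c("BNA", "BNA", "MCI", "AUS", "AUS"),
  ARR_DELAY = c(10, NA, -5, 20, 4)
)

# One output row per distinct ORIGIN-DEST combination
route_delays <- flights %>%
  group_by(ORIGIN, DEST) %>%
  summarise(
    n_flights  = n(),                           # rows in the group, NAs included
    sum_delay  = sum(ARR_DELAY, na.rm = TRUE),  # total delay, missing values skipped
    mean_delay = mean(ARR_DELAY, na.rm = TRUE), # average delay, missing values skipped
    .groups    = "drop"                         # return an ungrouped result
  )

print(route_delays)
```

Note that the direction matters here: AUS-BNA and BNA-AUS end up as two separate rows. If both directions should count as one route, the pair would need to be put into a common order before grouping.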

Thank you for your quick reply!
