For my final thesis I am struggling with the following:
The topic concerns flight delays between medium-sized airport hubs in the US. The full file contains 1.7 million observations. The picture below shows a few of them:
ORIGIN = Airport code for origin
DEST = Airport code for destination
I want to aggregate ARR_DELAY over all observations that share the same ORIGIN-DEST combination. There are thousands of such combinations. I have not been able to find code that does this; does anyone have a solution?
For instance: combining the ARR_DELAY values of all observations for the combination AUS-BNA.
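To make the goal concrete, here is a minimal sketch of the grouping in plain Python. It is only an illustration under assumptions: the function name aggregate_delay and the sample rows are hypothetical, rows with an empty ARR_DELAY are skipped, and both the total and the mean are computed because "aggregate" could mean either.

```python
from collections import defaultdict


def aggregate_delay(rows):
    """Group rows by (ORIGIN, DEST) and aggregate ARR_DELAY per pair.

    `rows` is any iterable of dicts with the keys ORIGIN, DEST and
    ARR_DELAY, e.g. what csv.DictReader yields for the data file.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        delay = row.get("ARR_DELAY")
        if delay in (None, ""):
            continue  # assumption: ignore observations without a delay value
        key = (row["ORIGIN"], row["DEST"])
        totals[key] += float(delay)
        counts[key] += 1
    return {
        key: {
            "total": totals[key],
            "mean": totals[key] / counts[key],
            "flights": counts[key],
        }
        for key in totals
    }


# Hypothetical sample rows, not taken from the real data set.
sample = [
    {"ORIGIN": "AUS", "DEST": "BNA", "ARR_DELAY": "10"},
    {"ORIGIN": "AUS", "DEST": "BNA", "ARR_DELAY": "-4"},
    {"ORIGIN": "BNA", "DEST": "AUS", "ARR_DELAY": "7"},
    {"ORIGIN": "AUS", "DEST": "BNA", "ARR_DELAY": ""},
]

result = aggregate_delay(sample)
print(result[("AUS", "BNA")])  # {'total': 6.0, 'mean': 3.0, 'flights': 2}
```

Note that AUS-BNA and BNA-AUS are treated as different combinations here. If the data is already loaded in pandas, the same idea would be df.groupby(["ORIGIN", "DEST"])["ARR_DELAY"].sum(), or .mean() for the average.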
Any help is much appreciated.
Best regards, Joris