Why is it "arrange" and not "sort" or "sort_by"?

I love dplyr::arrange(), and I use it all the time, but is there a reason why it's called arrange() and not something more... relevant? I feel like this has to have been talked about at length somewhere.

I feel like most dplyr verbs are consistent with the verbs my brain thinks of for the tasks (selecting variables, filtering rows, etc.), but arrange is definitely not the verb that first comes to mind for sorting.

Obviously, as an outside party, I can’t speak to the developers’ actual train of thought, but I think the other sorting words were already taken? E.g., base::sort(), base::order(), and stats::reorder() already exist as very commonly used functions. Clobbering any of those with a necessarily substantially different dplyr function would not be very user-friendly!
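To make the difference concrete, here's a rough sketch of what the existing base functions do next to arrange(). The vector `x` and the data frame `df` are toy data I made up for illustration, and it assumes dplyr is installed:

```r
library(dplyr)  # assumption: dplyr is installed

# Made-up example vector
x <- c(2, 3, 1)

sorted_values <- sort(x)   # sort() returns the values themselves, in order
sort_index    <- order(x)  # order() returns the positions that would sort x

# Made-up example data frame
df <- data.frame(name = c("b", "c", "a"), value = c(2, 3, 1))

# Base R idiom: index the rows with the permutation from order()
by_order <- df[order(df$value), ]

# dplyr verb: takes the data frame and column names, returns a data frame
by_arrange <- arrange(df, value)
```

So sort() and order() work on vectors, while arrange() works on whole data frames, which is a different enough job that reusing either name would probably have caused confusion.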

Where dplyr does share function names with R’s core, dplyr’s versions seem to do things like extend a generic to a new data type (as in lead()/lag()) or make something generic that wasn’t before (as in the set operations). So basically, effort seems to have been made to be a good citizen.

Sort implies ordinality: the objects have some natural order that you want to restore. Arrange simply implies a user-specified order that can be completely arbitrary. Three cheers for anarchy.
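For what it's worth, here's a small sketch of an "arbitrary" arrangement, using match() against a custom vector inside arrange(). The data frame `df` and the `custom_order` vector are made up for illustration, and it assumes dplyr is installed:

```r
library(dplyr)  # assumption: dplyr is installed

# Made-up example data
df <- data.frame(size = c("small", "large", "medium"), n = c(1, 2, 3))

# A user-specified order with no "natural" ranking behind it
custom_order <- c("medium", "small", "large")

# match() gives each row its position in custom_order; arrange() orders by that
arranged <- arrange(df, match(size, custom_order))
```

The rows come back as medium, small, large, which is nobody's idea of "sorted" but is exactly the arrangement that was asked for.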
