Pass orderby to dplyr arrange

Intent: Capture arrange specification in a variable and pass to dplyr::arrange.

Here is a function signature I am trying to implement:

use_orderby_spec(df, orderby_spec, ...)

Note that ... is reserved for something else.
Here is the expected usage,

use_orderby_spec(iris, c(Species, desc(Sepal.Width)))

which essentially should translate to

dplyr::arrange( iris, Species, desc(Sepal.Width) )

Observations:

  1. Solutions seem to revolve around using the dots (...). In my case, ... is used for something else.
  2. pick seems to help when the arrange specification is bare column names, but not when it includes terms like desc(colname).

If you accept the user passing the sorting instructions inside dplyr::vars(), then this would work:


library(tidyverse)
library(rlang)

use_orderby_spec <- function(df, orderby_spec, ...) {
  # ... stays free for other arguments; here they are simply printed
  dots <- dots_list(...)
  walk(dots, print)

  # orderby_spec is a list of quosures built by vars(); splice it into arrange()
  arrange(.data = df, !!!orderby_spec)
}

use_orderby_spec(iris,
                 orderby_spec = vars(Species, desc(Sepal.Width)),
                 myother_parms = "testing") |> head()

Thanks for the solution @nirgrahamuk
vars is superseded, so I am not sure it is good practice to use it.
It would be great to have the orderby_spec inside c, like c(Species, desc(Sepal.Width)).

The fact that that doesn't work in a normal arrange is a bad sign.
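If the c(...) form is still wanted, one possible approach is to capture the argument unevaluated and unpack the c() call before handing it to arrange. The following is only a minimal sketch, not a tested recommendation: the name use_orderby_spec_c is hypothetical, and it assumes the caller either wraps the terms in a literal c(...) or passes a single term.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: accepts c(col, desc(col2)) without evaluating c()
use_orderby_spec_c <- function(df, orderby_spec, ...) {
  # Capture the expression the caller typed, plus its environment
  spec <- enquo(orderby_spec)
  expr <- quo_get_expr(spec)
  env  <- quo_get_env(spec)

  # If the caller wrote c(...), take its arguments as the sort terms;
  # otherwise treat the whole expression as a single sort term
  terms <- if (is_call(expr, "c")) call_args(expr) else list(expr)

  # Rebuild one quosure per term so each is evaluated in the caller's
  # environment, then splice them into arrange()
  quos_list <- lapply(unname(terms), new_quosure, env = env)
  arrange(df, !!!quos_list)
}

use_orderby_spec_c(iris, c(Species, desc(Sepal.Width))) |> head()
```

Because the c() call is taken apart rather than evaluated, ... remains free for other arguments. The trade-off is that a vector of column names built elsewhere and passed in as a variable would not be recognised by this sketch.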

The superseded tag does not dissuade me; it just suggests the tidyverse developers did not realise there was still juice in vars, which is a thin wrapper around rlang::quos.
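For anyone who would rather avoid the superseded function, rlang::quos should be usable in its place. A small sketch comparing the two on iris (the variable names here are just for illustration):

```r
library(dplyr)
library(rlang)

# The same sort specification, built two ways
spec_vars <- vars(Species, desc(Sepal.Width))
spec_quos <- quos(Species, desc(Sepal.Width))

# Splice each specification into arrange()
by_vars <- arrange(iris, !!!spec_vars)
by_quos <- arrange(iris, !!!spec_quos)

# Compare the two results
identical(by_vars, by_quos)
```

Either object can be passed as orderby_spec to a function that splices it with !!!.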

And a thank you from me @nirgrahamuk. I had this as a side project.

I had wanted to pass two parameters to a function, each a list of columns, with one of them also able to determine sort order.

I don't have test data for a reprex, but the gist of it is below. DeDup returns unique rows per index (IDs/keys): it sorts by the order specification in orderfields, then keeps the first row from each group of rows that share the same ID fields, which is the standard sort-and-take-first approach.
The example below keeps the row with the latest update date, with ties decided by the latest create date.

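library(dplyr)

DeDup <- function(dfin, orderfields, indexfields) {
  # Sort by the keys first, then by the order specification,
  # and keep the first row for each key combination.
  # Returning the pipeline directly (no assignment) makes the result visible.
  dfin %>%
    arrange(!!!indexfields, !!!orderfields) %>%
    distinct(!!!indexfields, .keep_all = TRUE)
}

ExcDups <- DeDup(IncDups,
                 vars(desc(UpdateDate), desc(CreateDate)),
                 vars(ID1, ID2))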
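To make the behaviour easier to check, here is a sketch on a small made-up data frame. The data and the value column are invented purely for illustration and are not from the original project; only the column names ID1, ID2, UpdateDate and CreateDate follow the call above.

```r
library(dplyr)

DeDup <- function(dfin, orderfields, indexfields) {
  dfin %>%
    arrange(!!!indexfields, !!!orderfields) %>%
    distinct(!!!indexfields, .keep_all = TRUE)
}

# Hypothetical toy data: three rows share the key (1, "a"),
# two of them tie on UpdateDate and differ on CreateDate
IncDups <- data.frame(
  ID1 = c(1, 1, 1, 2),
  ID2 = c("a", "a", "a", "b"),
  UpdateDate = as.Date(c("2024-01-01", "2024-03-01", "2024-03-01", "2024-02-01")),
  CreateDate = as.Date(c("2023-01-01", "2023-01-01", "2023-06-01", "2023-05-01")),
  value = c("old", "tie_early", "tie_late", "only")
)

ExcDups <- DeDup(IncDups,
                 vars(desc(UpdateDate), desc(CreateDate)),
                 vars(ID1, ID2))
ExcDups
```

For the key (1, "a") the row kept should be the one labelled tie_late: it shares the latest UpdateDate with tie_early and has the later CreateDate.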