Advice on a cleaner way to control the sort direction and filter inside a pipeline

I have several calculated values, and for each one I would like to extract
the "most interesting groups". Depending on the value, those are the groups
with either the largest or the smallest ending values. To do this, I store
each group's last value in the chosen column, sort the data frame on it, and
use a function to control the sort direction. Then, similarly, I use a
function to select the top few groups.

In the real app I have a large table of counties with various measures (new
cases, doubling time, etc.) for Covid as functions of date. I want to extract
the current 5 worst counties for new cases, doubling time, etc.

This all feels cumbersome and has also proven error-prone. Note
that this lives in a Shiny app, and trying to debug a function called
in a pipe inside Shiny is pretty painful.

Is there a better pattern for being able to alter the direction of the
sort based on the value of a variable?


library(tidyverse)

#########   set up reprex
ascending <- c("a", "b")

df <-  tribble(~names, ~a, ~b, ~c, ~d,
               "A",  1,  2,  3,  4,
               "B",  5,  6,  7,  8,
               "C",  9, 10, 11, 12,
               "A",  9, 10, 11, 12,
               "B",  1,  2,  3,  4,
               "C",  5,  6,  7,  8,
               "A", 13, 14, 15, 16,
               "B",  9, 10, 11, 12,
               "C", 13, 14, 15, 16)

chosen_column <- c("a","c")


for (i in 1:2) {# test both sorts
########   Following is the bit that seems awkward

  asc <- chosen_column[i] %in% ascending
    
  do_sort <- function(df, asc) {
    if (asc){
      dplyr::arrange(df, Mselect)
    } else {
      dplyr::arrange(df, desc(Mselect))
    }
  }
  do_filter <- function(df, asc) {
    if (asc) {
      dplyr::filter(df, Mselect <= (unique(Mselect)[1]))
    } else {
      dplyr::filter(df, Mselect >= (unique(Mselect)[1]))
    }
  }

  foo <- df %>% 
    group_by(names) %>% 
    # Store last value by group for sorting and selection
    mutate(Mselect = last(!!as.name(chosen_column[i]))) %>%
    ungroup() %>%
    # Sort chosen column in proper direction
    do_sort(asc) %>%
    # Extract top few values
    do_filter(asc) %>%
    select(-Mselect)

  print(foo)
}

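You can multiply your Mselect variable by +1 or -1 depending on your asc variable. Negating the values turns "largest first" into "smallest first", so a single ascending arrange() and a single filter() cover both cases. This produces the same results as above and is much briefer.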
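To see why the sign flip works, here is a minimal base R sketch. The vector `x` is made up for illustration and is not part of the data above.

```r
# Hypothetical values, for illustration only
x <- c(3, 12, 7)

asc <- FALSE
sort_direction <- 2 * asc - 1     # TRUE gives 1, FALSE gives -1

# Ascending order of the signed values is descending order of x
x[order(sort_direction * x)]      # 12 7 3
```

With `asc <- TRUE` the multiplier is 1 and the same expression returns the values in ascending order.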

for (i in 1:2) {
  
  asc <- chosen_column[i] %in% ascending
  sort_direction <- 2*asc-1
  
  foo2 <- df %>% 
    group_by(names) %>% 
    mutate(Mselect=sort_direction*last(!!as.name(chosen_column[i]))) %>% 
    ungroup() %>% 
    arrange(Mselect) %>%
    filter(Mselect <= Mselect[1]) %>%
    select(-Mselect) 
  
  print(foo2)
  
}
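Since debugging inside Shiny was part of the problem, one option is to wrap the same steps in a plain function that can be run and tested at the console before it is used in the app. The following is only a sketch: `top_groups()` is a hypothetical name, the grouping column is assumed to be called `names` as in the reprex, and the small `df_small` table is made up for the example.

```r
library(dplyr)

# Hypothetical helper: keep the rows of the group(s) whose last value in
# `column` is the smallest (ascending = TRUE) or the largest (ascending = FALSE).
# Assumes the grouping column is called `names`, as in the reprex.
top_groups <- function(df, column, ascending = TRUE) {
  sort_direction <- if (ascending) 1 else -1
  df %>%
    group_by(names) %>%
    # Signed last value per group, so that "best" is always the minimum
    mutate(Mselect = sort_direction * last(.data[[column]])) %>%
    ungroup() %>%
    arrange(Mselect) %>%
    filter(Mselect <= Mselect[1]) %>%
    select(-Mselect)
}

# Made-up data: the last values of `a` are A = 9, B = 1, C = 5
df_small <- data.frame(
  names = c("A", "B", "C", "A", "B", "C"),
  a     = c(1, 5, 9, 9, 1, 5)
)

top_groups(df_small, "a", ascending = TRUE)    # rows of group B
top_groups(df_small, "a", ascending = FALSE)   # rows of group A
```

Here `.data[[column]]` is used in place of `!!as.name(...)` to look up the column by its name as a string; both should give the same result in this case.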

Ahhh yes, clever. Thank you!
