dbplyr - multiple window orderings in a single query

Hi there, I'm trying to generate an SQL query with multiple window function orderings using dbplyr.
I was trying to do this using multiple calls to the window_order function.
The default behaviour seems to be that ordering is additive.
Is there anyway to get window_order to overwrite previous orderings rather than add to them?

library(dplyr)
library(dbplyr)

df <- memdb_frame(x = 1:10, y = runif(10))

df %>% 
    window_order(x) %>% 
    mutate(z1 = cumsum(y)) %>% 
    window_order(desc(x)) %>% 
    mutate(z2 = cumsum(y)) %>% 
    show_query()
#> <SQL>
#> SELECT `x`, `y`, `z1`, SUM(`y`) OVER (ORDER BY `x`, `x` DESC ROWS UNBOUNDED PRECEDING) AS `z2`
#> FROM (SELECT `x`, `y`, SUM(`y`) OVER (ORDER BY `x` ROWS UNBOUNDED PRECEDING) AS `z1`
#> FROM `dbplyr_001`)

Is it possible to get to

#> <SQL>
#> SELECT `x`, `y`, `z1`, SUM(`y`) OVER (ORDER BY `x` DESC ROWS UNBOUNDED PRECEDING) AS `z2`
#> FROM (SELECT `x`, `y`, SUM(`y`) OVER (ORDER BY `x` ROWS UNBOUNDED PRECEDING) AS `z1`
#> FROM `dbplyr_001`)

i.e. for z1 I want ORDER BY x and for z2 I want ORDER BY x DESC (rather than ORDER BY x, x DESC)

Many thanks,
Robbie

Just to clarify, you would, then, want the results of the query you show above to be the same as that of the query without the first window_order()?

library(dplyr)
library(dbplyr)

df <- memdb_frame(x = 1:10, y = runif(10))

df %>% 
  mutate(z1 = cumsum(y)) %>% 
  window_order(desc(x)) %>% 
  mutate(z2 = cumsum(y)) %>% 
  show_query()

#> <SQL>
#> SELECT `x`, `y`, `z1`, SUM(`y`) OVER (ORDER BY `x` DESC ROWS UNBOUNDED PRECEDING) AS `z2`
#> FROM (SELECT `x`, `y`, SUM(`y`) OVER (ROWS UNBOUNDED PRECEDING) AS `z1`
#> FROM `dbplyr_001`)

Thanks for your reply Mara.

I want the first and the second calls to window_order().

I'm working data with date spells (start_date and end_date as columns).
For some operations in my pipe, I want my window ordering to be start_date, end_date and for others I want my window ordering to be start_date, end_date desc.
When I try to change my first ordering, subsequent orderings are appended.

E.g. if I do window_order(start_date, end_date) and then window_order(start_date, desc(end_date) the window ordering is:
ORDER BY start_date, end_date, start_date, end_date DESC

I can make it work if break the query up and run compute() but I was hoping to avoid this.

I think this problem is related to issue #373 for the dbplyr package.
I've left a comment documenting my problem.

Thanks

1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.