Nice, I woudl also like to see if there's any code or package to do this. Just for the specific case of quantile, I think you can use percentile_cont
(available since Postgres 9.4) to compute quantile, not sure if it's as precise as the R version of quantile though.
library(tidyverse)
library(dbplyr)
library(RPostgres)
library(DBI)
con <- dbConnect(RPostgres::Postgres(), dbname = "testdb")
copy_to(con, rownames_to_column(mtcars), name = "mtcars")
req <- "SELECT PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY mpg) AS q1 FROM mtcars"
tbl(con, sql(req))
## q1
## <dbl>
## 1 15.4
mtcars %>%
summarise(q1 = quantile(mpg, probs = 0.25))
## q1
## 1 15.425
On a side note, it would be great if someone can point me to mecanism to add specific aggregate function to an already existing DBI backend. Using the help from sql_variant
, I came up with this
pg_quantile <- function(x, probs, ...) {
stopifnot(length(probs) == 1) ## 1 parameters supported
sql(str_glue("PERCENTILE_CONT({probs}) WITHIN GROUP (ORDER BY {x})", x = x, probs = probs))
}
postgres_agg <- sql_translator(.parent = base_agg,
quantile = pg_quantile)
postgres_var <- sql_variant(
base_scalar,
postgres_agg,
base_no_win
)
conn <- structure(
list(),
class = c("TestCon", "DBITestConnection", "DBIConnection")
)
sql_translate_env.TestCon <- function(x) postgres_var
translate_sql(quantile(x, probs = 0.5), con = conn, window = FALSE)
## <SQL> PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY `x`)
It will be great to have a way to use this new translation directly with dplyr, any doc available ?
Thanks