How to subtract dates using dbplyr

Updated with reproducible example and sample data.

Hi,

In the database, I have a column of dates (posixct). I am trying to create a new column by subtracting 12 months from the column of dates. But it seems lubridate package is not supported.

Is there a way around to achieve that result using dbplyr or another packages?

I am connecting to the database using RPostgreSQL package and use dbplyr package to run queries.

Thank you

library(tidyverse)
library(dbplyr)
library(RPostgreSQL)
library(lubridate)


Sample <- tibble::tribble(
  ~PC_pat,        ~Date, ~Minus_12_months,
       1L, "29/05/2011",     "29/05/2010",
       2L, "14/07/2010",     "14/07/2009",
       3L, "24/08/2013",     "24/08/2012",
       4L, "13/04/2011",     "13/04/2010",
       5L, "18/11/2012",     "18/11/2011"
  ) %>% 
  mutate(Date = dmy(Date), Minus_12_months = dmy(Minus_12_months)) %>% 
  select(Date)

# Connecting to PostgreSQL server

drv <- dbDriver("PostgreSQL")

username <- "username"

passwords <- "passwords"

con <- DBI::dbConnect(drv, dbname = "dbase",
                 host = "amazonaws.com", port = 5432,
                 user = username, password = passwords)

Sample <- tbl(con, "Sample")

My expected result is this but lubridate is not supported in PostgreSQL. Is there a way around to get the same result

Sample %>% 
  mutate(Minus_12_months = Date - months(12))

Could you please rephrase this as a reproducible example and provide sample data? If you never heard of it before, here is how to do it

1 Like

dbplyr translates dplyr commands into sql commands but unfortunately it can't translate other libraries commands into sql, if you are comfortable writing sql queries you can use this in PostgreSQL

query <- "select (date - interval '12 months') as minus_12_months from public.sample"
result <- dbGetQuery(con, query)

A quick comment: there are RDBMSes that have a function to add / subtract months from a date - DATEADD() in SQL server and ADD_MONTHS() in Oracle come to my mind.

These should work with dbplyr. Unfortunately the Postgres way of adding interval and a string breaks dplyr code :frowning_face:

Would something like this work?

library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)
#> Warning: package 'dbplyr' was built under R version 3.5.2
df <- memdb_frame(x = runif(100), y = runif(100))
df %>%
  mutate(x = y - !!build_sql("interval '12 months'")) %>%
  show_query()
#> <SQL>
#> SELECT `y` - interval '12 months' AS `x`, `y`
#> FROM `tyziaqugiu`

Created on 2019-01-22 by the reprex package (v0.2.1)

8 Likes

You are a magician!

Yes, it works as expected, without breaking the dplyr syntax.

asdf <- asdf %>% # or what not...
   mutate(Minus_12_months = Date - !!build_sql("interval '1 years'")) %>%
   collect()
1 Like

Yay! I'm glad that worked! I hope that we'll eventually have support for a lot of these date functions, but In the meanwhile we can emulate how dbplyr puts together vector expressions by using build_sql() or sql_expr() , and force evaluation via !!.

3 Likes

This will not be easy, as many of the functions on backend databases are very platform specific.

In the meantime I should brush up my bang bang skills.

2 Likes

One more trick is to use the fact that infix operators will be translated into SQL directly. It allows for slightly (IMHO) less busy code that can be even converted into small DSL under the hood:

library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)
df <- memdb_frame(x = runif(100), y = runif(100))
df %>%
  mutate(x = y %- interval% '12 months') %>%
  show_query()
#> <SQL>
#> SELECT `y` - INTERVAL '12 months' AS `x`, `y`
#> FROM `pfjhwiitpd`

Created on 2019-01-22 by the reprex package (v0.2.1)

5 Likes

This is so great! Thank you all for your inputs.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.