Datetime Inner Join with Date - dbplyr - Microsoft SQL Server


#1

I want to inner join database table A with field DateTime (DD/MM/YYYY HH/MM/SS) and table B with field Date (DD/MM/YYYY).

a <- tbl(con, 'a')
b <- tbl(con, 'b')

ab_joined <- a %>%
                inner_join(b, by = c('DateTime' = 'Date'))

Currently, my join only gives me records where DateTime has HH/MM/SS of 00:00:00, so I am missing a significant amount.

How can I convert the DateTime to a simple Date field in-database using dbplyr as opposed to using dbGetQuery() and bringing the data in-memory?


#2

Hi @djslp, I would coerce them both to Dates, as.Date


#3

What would this look like? Here is what I tried and the following errors I received:

a %>%
  inner_join(x = a, y = b, by = c(as.Date('DateTime' = 'Date'))) %>%
  tally()

#> Error in charToDate(x) : argument "x" is missing, with no default
a %>%
  inner_join(x = a, y = b, by = c(as.Date('DateTime') = as.Date('Date'))) %>%
  tally()

#> Error: unexpected '=' in:"a %>% inner_join(x = a, y = b, by = c(as.Date('DateTime') ="
a %>%
  inner_join(x = a, y = b, by = as.Date(c('DateTime' = 'Date'))) %>%
  tally()

#> Error in charToDate(x) : character string is not in a standard unambiguous format

So far the only query that has worked does not return all of my records due to 'Date' having the timestamp of 00:00:00, which returns significantly less records due to the inner join.


#4

After some more troubleshooting, I found a solution:

a %>%
    mutate(newDate = as.Date(DateTime)) %>%
    inner_join(b, by = c('newDate' = 'Date'))

A tally count confirmed the correct amount of rows returning.