Adding translation for new dplyr backends

dplyr
dbplyr
databases

#1

I’m working with Teradata Databases at the moment and I would like to develop a dplyr backend for it. I’m already using odbc package for making queries with the database and it works great, but dplyr sql translations fails:

a <- tbl(con, "mytable")
a
Error in new_result(connection@ptr, statement) : 
  nanodbc/nanodbc.cpp:1344: 42000: [Teradata][ODBC Teradata Driver][Teradata Database] Syntax error: expected something between the word 'xbvfzvopfm' and the 'LIMIT' keyword.

So I think I only need to implement translations methods for the odbc DBI implementation. What’s the best starting point for this?


#2

I found that this was a good starting point: https://github.com/tidyverse/dbplyr/blob/master/R/db-odbc-oracle.R

Finally, the problem was the same as in oracle… LIMIT is called SAMPLE in Teradata databases.

So implementing a function like this worked:

sql_select.Teradata<- function(con, select, from, where = NULL,
                             group_by = NULL, having = NULL,
                             order_by = NULL,
                             limit = NULL,
                             distinct = FALSE,
                             ...) {
  out <- vector("list", 7)
  names(out) <- c("select", "from", "where", "group_by", "having", "order_by",
                  "limit")
  
  out$select    <- dbplyr:::sql_clause_select(select, con, distinct)
  out$from      <- dbplyr:::sql_clause_from(from, con)
  out$where     <- dbplyr:::sql_clause_where(where, con)
  out$group_by  <- dbplyr:::sql_clause_group_by(group_by, con)
  out$having    <- dbplyr:::sql_clause_having(having, con)
  out$order_by  <- dbplyr:::sql_clause_order_by(order_by, con)
  
  # Using Sample instead of limit
  if (!is.null(limit) && !identical(limit, Inf)) {
    assertthat::assert_that(is.numeric(limit), length(limit) == 1L, limit > 0)
    out$limit <- build_sql(
      "SAMPLE ", sql(format(trunc(limit), scientific = FALSE)),
      con = con
    )
  }
  
  escape(unname(dbplyr:::compact(out)), collapse = "\n", parens = FALSE, con = con)
}

And now the printing method for teradata tables works. Should I submit this as PR for dbplyr?


#3

It seems you already found the link to the open issue on github.
I just put it there to know it exists and it’s related.

https://github.com/tidyverse/dplyr/issues/3040