How to give dplyr a SQL query and have it return a remote tbl object?

Say I have a remote tbl open using dbplyr, and I want to use a SQL query on it (maybe because there's not dbplyr translation for what I want to do), how do I give it such that it returns a remote tbl object?

The DBI::dbGetQuery() function allows you to give a query to db, but it returns a data frame on memory, and not an remote tbl object.

For example, say you already have a connection con open to a db, you can create a table like this:

library(tidyverse)

x_df <- expand.grid(A = c('a','b','c'), B = c('d','e','f', 'g','h')) %>% 
  mutate(C = round(rnorm(15), 2))

DBI::dbWriteTable(conn = con,
                  name = "x_tbl",
                  value = x_df,
                  overwrite = TRUE)

x_tbl = tbl(con, 'x_tbl')

sql_query <- build_sql('SELECT a, b, c, avg(c) OVER (PARTITION BY a) AS mean_c FROM x_tbl')

y_df <- DBI::dbGetQuery(con, sql_query) # This returns a data frame on memory

y_tbl <- x_tbl %>% 
  group_by(a) %>% 
  mutate(mean_c = mean(c))

show_query(y_tbl) # This is a remote tbl object

In this case, I could just use y_tbl . But there are cases in which the function has not been translated on dbplyr (for example, quantile doesn't work), and I need to use SQL code. But I don't want to collect the result, I want it to create a remote tbl object. Is there a way I can give a SQL query (like with dbGetQuery() ) but have it return a remote tbl?

Thank you

I'd say try DBI::dbExecute() with the SQL prepended with "CREATE TABLE tablename AS ", then dply::tbl(db, "tablename") should give you a reference to the remote table.

About that, in case you don't know, you can provide some SQL functions directly through dplyr verbs. If there is no known translation there are passing to SQL as is. You can also use SQL infix operator using %<operator>%: || will be translated as SQL OR and %||% will be pass as || to SQL. This can be very useful to pass advanced SQL in dplyr piped worflow. Also dplyr::sql() can be useful to pass raw sql.

Some documentation

All other functions will be preserved as is. R's infix functions (e.g. %like% ) will be converted to their SQL equivalents (e.g. LIKE ). You can use this to access SQL string concatenation: || is mapped to OR , but %||% is mapped to || . To suppress this behaviour, and force errors immediately when dplyr doesn't know how to translate a function it encounters, using set the dplyr.strict_sql option to TRUE .
You can also use sql() to insert a raw sql string.

About the last part, i think something like

tab <- tbl(con, sql(...))

could work.

For this kind of mix between SQL queries and R object, know that Rmarkdown and R notebook can be useful as you could mixed SQL chunk and R chunk, getting the SQL result as an R object directly. See:
https://db.rstudio.com/getting-started/database-queries#query-using-an-r-notebooks

4 Likes

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.