Generate data.frame from sql-script (in Rstudio 1.2)

sql

#1

Suppose I have the following SQL statement that I wish to save as a data.frame in R.

  select "month_idx", "year", "month",
  sum(case when "term_deposit" = 'Yes' then 1.0 else 0.0 end) as subscribe,
  count(*) as total
  from "bank"
  group by "month_idx", "year", "month"

Using an R-script this is easy, just wrap the code with dbGetQuery().

However, with the preview release of Rstudio functionality has been added so we can write SQL-scripts in SQL-files with code highlightning, autocomplete etc. So my workflow (which involves a lot of SQL) is this: I type the SQL-code in a separate SQL-script in RStudio, much like I did with Visual Studio before, then I copy the SQL-code to an R-script that uses dbGetQuery() and saves it as a data.frame.

But I am OBSESSED with elminiating copy & paste actions in my workflow. Is there a way to save a query from SQL-scripts to a data.frame in RStudio?

In my head it would be something like:

-- !preview conn=con, output.var = "df"
  select "month_idx", "year", "month",
  sum(case when "term_deposit" = 'Yes' then 1.0 else 0.0 end) as subscribe,
  count(*) as total
  from "bank"
  group by "month_idx", "year", "month"

But that doesn't work. Anybody know if it's possible in any other way?

This is example code from the RStudio database documentation site


#2

My admittedly old snippet is

library(tidyverse)
library(DBI)
drv <- dbDriver("MySQL")
con <- dbConnect(drv, username="root", password="", dbname ="dlf", host="localhost")
res <- as.tibble(dbGetQuery(con, "SELECT deal, fico FROM y6"))

con is happy to be wrapped in a function

con_fnc <- function(d,h) {
    dbConnect(drv, username="root", password="", dbname = d, host = h)
}
con <- con_fnc("dlf","localhost")
res <- as.tibble(dbGetQuery(con, "SELECT deal, fico FROM y6"))

With the same output as the hardwired line above

 res
# A tibble: 124,645 x 2
   deal          fico
   <chr>        <dbl>
 1 LBMLT 2006-1   509
 2 LBMLT 2006-1   698
 3 LBMLT 2006-1   584
 4 LBMLT 2006-1   788
 5 LBMLT 2006-1   585
 6 LBMLT 2006-1   542
 7 LBMLT 2006-1   526
 8 LBMLT 2006-1   743
 9 LBMLT 2006-1   692
10 LBMLT 2006-1   583
# ... with 124,635 more rows

Is that the problem you're trying to solve?


#3

Sorry, maybe my question was not properly formulated.

What I want to do is to write SQL-scripts in SQL-files in RStudio (with proper formatting) and when I execute the queries (in the SQL-script in Rstudio) save the tables generated as data.frames in R without having to copy the SQL-code to dbGetQuery.

One way to work with R-scripts och SQL-scripts separated, I found, is to save the SQL-query to a file, my-sql.sql and then run this in the R-script:

df <- dbGetQuery(con, statement = readr::read_file('my-sql.sql'))

However, this is not exactly what I want to do.


#4

Something along these lines?

library(tidyverse)
library(DBI)
query1 <- "SELECT deal, fico FROM y6"
drv <- dbDriver("MySQL")
con <- dbConnect(drv, username="root", password="", dbname ="dlf", host="localhost")
res <- as.tibble(dbGetQuery(con, query1))
 str(res)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':	124645 obs. of  2 variables:
 $ deal: chr  "LBMLT 2006-1" "LBMLT 2006-1" "LBMLT 2006-1" "LBMLT 2006-1" ...
 $ fico: num  509 698 584 788 585 542 526 743 692 583 ..