The link between knitr {sql} and tidyverse::glue

glue
knitr
sql

#1

knitr has a great method of interacting with sql databases through sql chunks](http://rmarkdown.rstudio.com/authoring_knitr_engines.html#sql) in a clean way with the SQL properly formatted in the body, but still able to have “some” r parameters passed to it via ?parameter. However, it doesn’t allow the user (as far as I’ve found) to pass in vectors of values to mimic things like WHERE value IN (1,2,3,4) easily.

The glue package has great methods to deal with the situation above, but doesn’t format nicely in print and decreases readability by using sprintf and paste-ish formats and requiring the query to be in a string.

Is there a method to allow knitr chunks to accept vectors like glue? e.g.

R chunk

values <- c(1,2,3)
table <- iris

SQL chunk

SELECT *
FROM {table}
WHERE value in {values*}

#2

It should be possible to define a glue (and glue_sql) knitr engine.


#3

Thanks @jimhester, I’ll look into doing that myself (though I suspect it might be a little beyond me). The package itself is very good, thanks for your work :slight_smile:


#4

Ha, in looking at how a knitr engine works I think I’ve just found your pull request… XD


#5

I have an implementation for a glue_sql engine at https://github.com/tidyverse/glue/commit/69bc72ce83ed997953511041e6b52d0f73b7e1dc you can use. Until knitr#1468 is merged you will have to use either

{r engine = "glue_sql"}

Or

{gluesql}

To start the block. Also be sure you attach the glue package with library(glue) before trying to use the engine.


#6

Fantastic, thank you very much! I’ll try it out ASAP :slight_smile:


#7

Afraid I’m having some trouble, I’m probably missing something.

If I install the new version of glue

devtools::install_github("tidyverse/glue", ref = "69bc72ce83ed997953511041e6b52d0f73b7e1dc")

I get a successful install message DONE (glue)

In a new notebook I have libraries loaded:

library(odbc)
library(knitr)
library(glue)

I also have an open ODBC connection that will run normal {sql}, but when it comes to a chunk with the header:

```{r engine = "glue_sql"}
SELECT *
FROM table
WHERE value IN ({values*})

And alternatively

```{gluesql}
SELECT *
FROM table
WHERE value IN ({values*})

I only get the following when running the chunk

/bin/sh: glue_sql: command not found

or

/bin/sh: gluesql: command not found

respectively. What have I missed?

Sorry for the hassle.


#8

It doesn’t work in a notebook, which likely needs extra tooling. But should work fine in a regular Rmd.


#9

When I convert the yaml to output: html_document, I get this error on the first line that the {gluesql} or {r engine = "glue_sql"} chunk:

Error: All unamed arguments must be length 1 Execution halted

apologies :frowning:


#10

It would be helpful to see exactly what code you are trying to run.

Here is a reproducible example https://gist.github.com/jimhester/350e770b8e31aa1fb6cff076a8c2f169

and the output on my machine

http://rpubs.com/jimhester/glue_sql_example


#11

Fixed it! Totally a problem on my side, sorry.

TL;DR I assumed that install.github() would overwrite the cran install of the package, but it actually installed it in a secondary location, so both packages were available at different points in my .libpaths() list.

Your gist runs just as expected, and a slightly modified version runs on the Redshift cluster I am currently using.

Thank you very much for your work :slight_smile: