concatenate a variable using glue (or other means) within a sql script

I have an r script and a sql script:

library(tidyverse)
library(dbplyr)
library(DBI)
library(glue)
iris_db <- tbl_memdb(iris)
con <- src_memdb()$con

var1 <- "setosa"
query <- read_lines("example.sql") %>% glue_collapse(sep = "\n") %>% glue_sql(.con = con)
setosa <- dbGetQuery(con, query)
setosa %>% glimpse()

Observations: 50
Variables: 5
$ Sepal.Length <dbl> 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9, 5.4, 4.8, 4.8, 4.3, 5.8, 5.7, 5.4, 5.1, 5.7, 5.1, 5.4, 5.1, …
$ Sepal.Width  <dbl> 3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1, 3.7, 3.4, 3.0, 3.0, 4.0, 4.4, 3.9, 3.5, 3.8, 3.8, 3.4, 3.7, …
$ Petal.Length <dbl> 1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 1.5, 1.5, 1.6, 1.4, 1.1, 1.2, 1.5, 1.3, 1.4, 1.7, 1.5, 1.7, 1.5, …
$ Petal.Width  <dbl> 0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.3, 0.2, 0.2, 0.1, 0.2, 0.2, 0.1, 0.1, 0.2, 0.4, 0.4, 0.3, 0.3, 0.3, 0.2, 0.4, …
$ Species      <chr> "setosa", "setosa", "setosa", "setosa", "setosa", "setosa", "setosa", "setosa", "setosa", "setosa", "setosa", …

example.sql:

-- !preview conn=con

select *
from iris
where species = {var1}

So far so good.

I would like to name my field in select using a variable e.g.

-- !preview conn=con

select
  Sepal.Length,
  Species as Species || "-" {var1}
from iris
where species = {var1}

Wanted the results to look like:

Sepal.Length ....
Species-setosa ...

Running the above gives:

Error: near "||": syntax error

Tried:
Species as concat(Species, {var1}) which gives:

Error: near "(": syntax error

Tried:
Species as concat('Species', {var1})

Error: near "(": syntax error

Tried (per examples here: https://github.com/tidyverse/glue):
Species as {'Species-', var1}

Error in parse(text = text, keep.source = FALSE) :
:1:11: unexpected ','
1: 'Species-',
^

Is there a way to include the variable enclosed within crly braces in the final field name of a query?

You can use any R code within the {}, including other calls to glue() or any other function. So if I understand the question properly I would do something like this.

# Setup code
library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "iris", iris)

#Creating the SQL string
var1 <- "virginica"
library(glue)
sql <- glue_sql("
  select 'Sepal.Length', Species as {glue('Species-{var1}')}
  from iris
  where species = {var1}
", .con = con)
sql
#> <SQL>   select 'Sepal.Length', Species as 'Species-virginica'
#>   from iris
#>   where species = 'virginica'

# Verify the query works
dbGetQuery(con, sql, n = 5)
#>   'Sepal.Length' Species-virginica
#> 1   Sepal.Length         virginica
#> 2   Sepal.Length         virginica
#> 3   Sepal.Length         virginica
#> 4   Sepal.Length         virginica
#> 5   Sepal.Length         virginica

Created on 2020-02-14 by the reprex package (v0.3.0)

2 Likes

Exactly what I was looking for, thank you!

Trying this on real data and not the reprex and getting an error. I wonder if it's specific to my odbc driver which is for AWS Athena.

Does anything jump out at you here?

r script:

day_from <- 7
day_to <- 30

sql script:

select 
  i.id,
  coalesce(sn.sessions_day_from, 0) as {glue('sessions_day_{day_from}')}
from installs_base i 
left join sessions_day_from sn on sn.s = i.s 

Results in:

Error in new_result(connection@ptr, statement, immediate) :
nanodbc/nanodbc.cpp:1374: 00000: [Simba][Athena] (1040) An error has been thrown from the AWS Athena client. Athena Error No: 130, HTTP Response Code: 400, Exception Name: InvalidRequestException, Error Message: line 150:40: mismatched input ''sessions_day_7'' expecting {'ADD', 'ALL', 'SOME', 'ANY', 'AT', 'NO', 'SUBSTRING', 'POSITION', 'TINYINT', 'SMALLINT', 'INTEGER', 'DATE', 'TIME', 'TIMESTAMP', 'INTERVAL', 'YEAR', 'MONTH', 'DAY', 'HOUR', 'MINUTE', 'SECOND', 'ZONE', 'FILTER', 'OVER', 'PARTITION', 'RANGE', 'ROWS', 'PRECEDING', 'FOLLOWING', 'CURRENT', 'ROW', 'SCHEMA', 'COMMENT', 'VIEW', 'REPLACE', 'GRANT', 'REVOKE', 'PRIVILEGES', 'PUBLIC', 'OPTION', 'EXPLAIN', 'ANALYZE', 'FORMAT', 'TYPE', 'TEXT', 'GRAPHVIZ', 'LOGICAL', 'DISTRIBUTED', 'VALIDATE', 'SHOW', 'TABLES', 'VIEWS', 'SCHEMAS', 'CATALOGS', 'COLUMNS', 'COLUMN', 'USE', 'PARTITIONS', 'FUNCTIONS', 'TO', 'SYSTEM', 'BERNOULLI', 'POISSONIZED', 'TABLESAMPLE', 'ARRAY', 'MAP', 'SET', 'RESET', 'SESSION', 'DATA', 'START', 'TRANSACTI

I cannot see what's different compared with the example that you showed me. Any pointers much appreciated if anything jumps out at you?

I see the issue but am unsure how to solve it. The field name is being translated to a string. When I view the query:

prediction_query <- query <- read_lines("prediction_data_query.sql") %>% glue_collapse(sep = "\n") %>% glue_sql(.con = con)

Then if I view prediction_query:

select 
  i.id,
  coalesce(sn.sessions_day_from, 0) as 'sessions_day_7'
from installs_base i 
left join sessions_day_from sn on sn.s = i.s 

I tried removing the quotes but that results in another error. So i think that's the problem, that coalesce(sn.sessions_day_from, 0) as 'sessions_day_7' has the as part as a string.

Do you get the same error if you replace the single-quotes with double-quotes in the prediction query?

Do you mean change this:

coalesce(sn.sessions_day_from, 0) as {glue('sessions_day_{day_from}')}

To this?

coalesce(sn.sessions_day_from, 0) as {glue("sessions_day_{day_from}")}

I tried that just now but got with the same outcome