Function with input in SQL command

I have a code as below:

res <- dbSendQuery(wrds, "SELECT secid, date, best_bid, best_offer, exdate, last_date, cp_flag, strike_price, 
									volume, open_interest, impl_volatility, delta, cfadj, ss_flag
                  FROM  optionm.opprcd2011
                   WHERE date BETWEEN '2011-01-01' AND '2011-12-31' AND
                   last_date IS NOT NULL AND (date - last_date) < 30 AND
                   best_offer > best_bid AND volume >0 AND
                   (best_offer - best_bid)<=0.5*(best_offer+best_bid) AND
                   exdate-date<=365 AND
                   open_interest >0 AND
                   impl_volatility IS NOT NULL AND
                   delta IS NOT NULL")


optionm_opprcd2011 <- dbFetch(res, n = -1)

The code above is an example for year 2011.

I need to repeat this process for years 1997, 1998..., 2018. I would like to write a function with an input of the year, and use lapply to apply the function to 1997:2018. The issue is that the input is embedded in the SQL command in the function, so I am not sure how to do it.

glue looks like a good fit for this.

1 Like

You'll find example on how to build query from R with arguments
http://db.rstudio.com/best-practices/run-queries-safely/

As @martin.R said, glue :package: has a glue_sql function that would help here. Either for parametrized query if your db supports it, or for using variable and glue mechanism.

3 Likes

As long as you don't have a different table for each year (ie, optionm.opprcd2011, optionm.opprcd2012, etc), you may find RODBCext to have some added convenience. You can feed sqlExecute a list of vectors with the start and end dates, and it will do the work of looping over the arguments and binding the results into a single data frame for you. If there's a DBI form of this approach, I don't know what it is.

library(RODBCext)

sqlExecute(
  channel,
  "SELECT secid, date, best_bid, best_offer, exdate, last_date, cp_flag, strike_price, 
									volume, open_interest, impl_volatility, delta, cfadj, ss_flag
  FROM  optionm.opprcd2011
  WHERE date BETWEEN ? AND ? AND
  last_date IS NOT NULL AND (date - last_date) < 30 AND
  best_offer > best_bid AND volume >0 AND
  (best_offer - best_bid)<=0.5*(best_offer+best_bid) AND
  exdate-date<=365 AND
  open_interest >0 AND
  impl_volatility IS NOT NULL AND
  delta IS NOT NULL",
  data = list(start_date = paste0(1997:2018, "-01-01"),
              end_date = paste0(1997:2018, "-12-31")),
  fetch = TRUE,
  stringsAsFactors = FALSE
)

If each year is stored in a different table, you're probably going to have to write your own loop/apply call.

1 Like