Inserting SQL commands when using the dplyr syntax

I like the database queries using the dplyr syntax, but the translation is not all implemented for all the verbs.

For example: based on Database Queries With R

flights_db %>% 
  filter(str_detect(tailnum, "^N[0-9][A-Z]")) %>% 
  select(year:day, dep_delay, arr_delay)

Error: str_detect() is not available in this SQL variant

Is it possible to insert in this dplyr chain an SQL string? In this case to solve for the str_detect().

Note: I am aware of the glue_sql(), previously I was using this option, but lately I started to use the dplyr for writing queries.

You can use the SQL command applicable to your version of SQL in place of str_detect(), e.g. REGEXP().

@martin.R That's great. Thank you.

Just I have to figure out the right syntax.

flights_db %>% 
  filter(tailnum REGEXP '^N[0-9][A-Z]') 

Error: unexpected symbol in:
"flights_db %>% 
  filter(tailnum REGEXP"

Can you share some example?

Try this:

flights_db %>% 
  filter(tailnum == REGEXP ('^N[0-9][A-Z]')) 

If that isn't recognised, check your version of SQL for a similar regex expression.

Most of the SQL dialects (which I am familiar with) are using a syntax similar to WHERE fieldname REGEXP 'pattern'. For this reason the suggested solution most probably won’t work.

flights_db %>% 
  filter(tailnum == REGEXP ('^N[0-9][A-Z]')) %>% 
  show_query()

<SQL>
SELECT *
FROM `flights`
WHERE (`tailnum` = REGEXP('^N[0-9][A-Z]'))

But your suggestion put me back on the right path and I found a solution. Thank you.


tailnum_regex <- glue::glue_sql("
tailnum REGEXP '^N[0-9][A-Z]'
")

flights_db %>% 
  filter(tailnum_regex) %>% 
  show_query()

<SQL>
SELECT *
FROM `flights`
WHERE (tailnum REGEXP '^N[0-9][A-Z]')

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.