When using the dbplyr package to talk to a sqlite database found here:
It mentions that "All dplyr calls are evaluated lazily, generating SQL that is only sent to the database when you request the data."
So within the example that they gave available :
# lazily generates query summary <- mtcars2 %>% group_by(cyl) %>% summarise(mpg = mean(mpg, na.rm = TRUE)) %>% arrange(desc(mpg)) # see query summary %>% show_query() #> <SQL> #> SELECT `cyl`, AVG(`mpg`) AS `mpg` #> FROM `mtcars` #> GROUP BY `cyl` #> ORDER BY `mpg` DESC # execute query and retrieve results summary %>% collect() #> # A tibble: 3 x 2 #> cyl mpg #> <dbl> <dbl> #> 1 4 26.7 #> 2 6 19.7 #> 3 8 15.1
I understand that the process is 1) summary object is the query() and 2) is only executed when you call the
collect() function because it works on lazy execution.
what i would like to know is - is it possible to not write a string of queries and then collect them at the end for example?
# lazily generates query summary <- mtcars2 %>% group_by(cyl) %>% summarise(mpg = mean(mpg, na.rm = TRUE)) %>% arrange(desc(mpg)) summary_2 <- summary %>% nrow() # execute query and retrieve results summary_2 %>% collect()
Is this possible?
Or is the only option to write code - collect the code - send that table back to the database - perform another dplyr query - collect that and write that back to the database ?
Is the above the only process?