DBI vs. dplyr/dbplyr for creating SQL databases/tables

Hello RStudio Community,

Aim/ Desired Behavior

I am creating my first SQL database and am very new to SQL. The input is many CSVs that will be appended to one another within an SQL table. I would like to create/append to the SQL database/tables via R, since I will have to append additional CSVs at later timepoints. I would like to then be able to use the database with dplyr verbs (https://db.rstudio.com/dplyr/).

Question 1: DBI or dplyr/dbplyr

I see I can create the database with either DBI or dplyr/dbplyr and am trying to figure out which is preferable. I wasn't able to find a recommendation on https://db.rstudio.com/. I did find a comment from Hadley Wickam suggesting to use DBI, but this is from 2017 and may be outdated: https://github.com/tidyverse/dplyr/issues/3120#issuecomment-339034612
Any suggestions or links to resources are much appreciated!
I poked around with both packages and am including the reprex. (Note: The reprex uses a SQLite database; however, I will use PostGreSQL.)

Question 2: "write" vs. "create" and "insert/append"

DBI or dplyr/dbplyr documentation differ in their recommendations for these methods, plus I couldn't get some of the functions to work. What is the current recommended practice?

DBI

New code should prefer dbCreateTable() and dbAppendTable(). (https://dbi.r-dbi.org/reference/dbwritetable)

dplyr

db_create_table() and db_insert_into() have been deprecated in favour of db_write_table(). (https://dplyr.tidyverse.org/reference/backend_dbplyr.html)

Thank you for your thoughts and advice!

library(DBI)
library(dbplyr)
library(dplyr)

# Create db with DBI ------------------------------------------------------
con_dbi <- DBI::dbConnect(RSQLite::SQLite(), path = ":dbname:")

# Add table to database
DBI::dbWriteTable(con_dbi, name = "mtcars1", value = mtcars)

# Doesn't work: empty table
# DBI::dbCreateTable(con_dbi, name = "mtcars2", fields = mtcars)

# Add rows to table in database
DBI::dbWriteTable(con_dbi, name = "mtcars1", value = mtcars, append = TRUE)
DBI::dbAppendTable(con_dbi, name = "mtcars1", value = mtcars)
#> [1] 32

# Create db with dplyr ----------------------------------------------------
con_dplyr <- DBI::dbConnect(RSQLite::SQLite(), path = ":dbname:")

# Add table to database
dplyr::copy_to(con_dplyr, mtcars, "mtcars1", temporary = FALSE)
dplyr::db_write_table(con_dplyr, table = "mtcars2", types = db_data_type(con_dplyr, mtcars), values = mtcars)
#> [1] "mtcars2"

# Doesn't work: empty table
# dplyr::db_create_table(con_dplyr, table = "mtcars3", types = db_data_type(con_dplyr, mtcars))


# Add rows to table in database
dplyr::db_insert_into(con_dplyr, table = "mtcars2", values = mtcars)

# Doesn't work: https://github.com/tidyverse/dplyr/issues/3120
# dplyr::db_write_table(con_dplyr, table = "mtcars2", types = db_data_type(con_dplyr, mtcars), values = mtcars, append = TRUE)

# Explore table -----------------------------------------------------------
db <- con_dbi
tab <- "mtcars1"

mtcars_db <- tbl(db, tab)
mtcars_query <- mtcars_db %>% select(cyl)
mtcars_query %>% show_query()
#> <SQL>
#> SELECT `cyl`
#> FROM `mtcars1`
mtcars_query %>% collect()
#> # A tibble: 96 x 1
#>      cyl
#>    <dbl>
#>  1     6
#>  2     6
#>  3     4
#>  4     6
#>  5     8
#>  6     6
#>  7     8
#>  8     4
#>  9     4
#> 10     6
#> # … with 86 more rows

Created on 2020-10-29 by the reprex package (v0.3.0)

4 Likes

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.