Thanks, I will take a look at this. I am also thinking I might try:
- Write the new data to a new table, e.g. “table_tmp”
- Once the new table is written, remove the old table
- Use ALTER to rename “table_tmp”, effectively replacing the old table.
That sounds like it should cut down on downtime considerably (assuming ALTER is a quick operation). For example:
#' Refreshing a table in a database
#'
#' The goal is to replace a table with new data with as little downtime as
#' possible. The method below does this by writing new data to a table with
#' '_tmp' suffixed to the table's name, keeping the old data intact. Once the
#' '_tmp' table has been written, the old table is dropped and the new table
#' is renamed to remove the '_tmp' suffix. Renaming with ALTER is a cheap
#' metadata operation, so downtime with this approach is minimal.
library(dbplyr)
library(dplyr, warn.conflicts = FALSE)
# connect
con <- DBI::dbConnect(odbc::odbc(), "r4fun")
# write old data
DBI::dbWriteTable(con, "flights", nycflights13::flights)
# write new data with "_tmp" suffix
DBI::dbWriteTable(con, "flights_tmp", nycflights13::flights)
# remove old data
DBI::dbRemoveTable(con, "flights")
# rename the new table by removing the "_tmp" suffix
DBI::dbExecute(con, "ALTER TABLE flights_tmp RENAME TO flights;")
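One caveat with the steps above: between `dbRemoveTable()` and the `ALTER`, there is a brief window where no `flights` table exists at all. On backends with transactional DDL (SQLite and PostgreSQL, for example; MySQL auto-commits DDL, so this won't help there), the drop and rename can be wrapped in a single transaction so readers never observe the gap. A minimal sketch using an in-memory RSQLite database as a stand-in for the "r4fun" connection (the tiny data frames are placeholders, not the real flights data):

```r
library(DBI)

# in-memory SQLite stands in for the real "r4fun" connection
con <- dbConnect(RSQLite::SQLite(), ":memory:")

dbWriteTable(con, "flights", data.frame(x = 1))      # old data
dbWriteTable(con, "flights_tmp", data.frame(x = 2))  # new data

# drop + rename inside one transaction; SQLite's DDL participates in
# transactions, so the swap appears atomic to concurrent readers
dbWithTransaction(con, {
  dbExecute(con, "DROP TABLE flights;")
  dbExecute(con, "ALTER TABLE flights_tmp RENAME TO flights;")
})

result <- dbReadTable(con, "flights")  # now holds the new data
dbDisconnect(con)
```

Whether this buys you anything depends on the database behind the `odbc` DSN, so it's worth checking how your backend treats DDL inside transactions before relying on it.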
Anyway, I realize this is getting a bit out of scope for R-related questions, so I'll likely raise follow-up questions elsewhere. Really appreciate the responses, thank you.