Controlling names of temporary dbplyr tables?

dbplyr creates temporary tables when copy = TRUE is coded when joining tables together with inner_join or similar dplyr functions using a local tibble as the "right" table. A message like this is shown in the console:

Created a temporary table named: ##dbplyr_007

How can I control what these temporary table names are? When running multiple sessions processing different data I see error messages that a particular temporary table already exists. I don't think I would have that problem if I could make sure the tables have different names in the different sessions. A high-precision time stamp as part of the temporary name would help.

1 Like

Interesting !
I'm thinking from having searched the codebase, that it might be backend dependent ?


# Temporary tables --------------------------------------------------------
# SQL server does not support CREATE TEMPORARY TABLE and instead prefixes
# temporary table names with #

mssql_temp_name <- function(name, temporary){
  # check that name has prefixed '##' if temporary
  if (temporary && substr(name, 1, 1) != "#") {
    name <- paste0("##", name)
    message("Created a temporary table named: ", name)
  }
  name
}

So I would experiment with possibly a different backend, or try and see what happens if I made a custom mssql_temp_name function, that added in number from datetime as per your idea. Finally, I might give up and log an issue on the github.

Hope this can be of some help to you.

I agree with you.
Perhaps raise an issue with the package owners , this might prompt them to improve their product :slight_smile:

This looks like a simple error — we should be using #dbplyr_007 to create a session-local temporary table instead of ##dbplyr_007 which creates a global temporary table. Please feel free to file an issue at https://github.com/tidyverse/dbplyr/issues.

2 Likes

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

Thanks for your reply and the research about how the temporary table is named. But I can't just move billions of records to experiment with a different backend.

I'm connecting to an MS SQL database in Azure.

Part of a convenient dplyr expression includes "copy = TRUE" to use a local tibble as a temporary table in a database join to force a set intersection to filter the data:

inner_join(targetXTypes, by = "X_TYPE_ID",
           copy = TRUE) %>%

The "copy = TRUE" causes a temporary table to be created with the message shown above. But when multiple concurrent R sessions are run using the same code on different data with the same database (I believe) there's a race condition and the first session gets to create a temporary table, but the second session errors out with a duplicate temporary name.

I can do the following (which is a bit longer) to avoid the problem:

tempName <- paste0("TargetX_", format(Sys.time(), "%Y%m%d_%H%M%S"))
targetX <- copy_to(databaseConnection,
                   targetXTypes,
                   tempName)

But "copy = TRUE" should be doing something similar to avoid duplicate temporary table names.