Dbplyr and purrr::map on MS SQL Server 12: "Invalid object name"

purrr
database
dbplyr

#1

I'm running into a problem when mapping to a list of SQL table pointers using purrr and dbplyr. Essentially what I want to do is use a glue character vector with all the table names I need, and then manipulate all the tables using purrr::map().

However there is a problem. I have no issue using dbplyr on this database with a single table, as shown in the example below. But for multiple tables I get the following error:

#> [[1]]
#> Error: <SQL> 'SELECT  TOP 10 *
#> FROM "flights_1"'
#>   nanodbc/nanodbc.cpp:1587: 42S02: [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Invalid object name 'flights_1'. 

So here is an example of what I am doing on Microsoft SQL Server 12.0.5203.0.

library(tidyverse)
library(glue)
library(odbc)

# Connection and data -----------------------------------------------------

# Establish connection
con <- dbConnect(odbc::odbc(), "MS_SQL_Server_12", Database = "my_Database", 
                 UID = "my_Username", PWD = rstudioapi::askForPassword("Your password?"))

# Copy example data to connection
copy_to(con, nycflights13::flights, "flights_1",
        temporary = FALSE, 
        indexes = list(
          c("year", "month", "day"), 
          "carrier", 
          "tailnum",
          "dest"
        )
)

copy_to(con, nycflights13::flights, "flights_2",
        temporary = FALSE, 
        indexes = list(
          c("year", "month", "day"), 
          "carrier", 
          "tailnum",
          "dest"
        )
)


# The actual stuff --------------------------------------------------------


# Single table works dandy
single_table_pointer <- tbl(con, "flights_1")

# But all of the sudden multiple tables doesn't work
table_names <- glue("flights_{1:2}")

table_list <- table_names %>% 
  map(., .f = ~tbl(con, .x))

table_list
#> [[1]]
#> Error: <SQL> 'SELECT  TOP 10 *
#> FROM "flights_1"'
#>   nanodbc/nanodbc.cpp:1587: 42S02: [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Invalid object name 'flights_1'. 


This is something that has worked in the past for me (thus I'm wondering if it has something to do with packages issues arising from updating to 3.5), and works using toy examples where the connection "con" is for instance.

con <- DBI::dbConnect(RSQLite::SQLite(), path = ":memory:")

I know this is a bit of a stretch, but if anyone here has experienced something similiar I'd love to hear from you.


#2

Hello,

Can you try with using glue::glue_sql("flights_{1:2}") to have a DBI::SQL object and not a character. Maybe it is something sensitive for a MySQL server.

I don't have one to test further. I can confirm it works with SQLite. However, I changed the tag to add database so that people with the knowledge can see your post more easily.


#3

Thanks so much for the reply! I tried glue_sql and sadly it doesn't work.

I've tried other drivers I have for Microsoft SQL Server, but that hasn't worked. I'm currently waiting on IT to update to The Microsoft ODBC Driver 17.

I couldn't help but wonder if this might have something to do with the encoding (I'm forced to have a system default locale of ISO-8859-1 and the server in question is SQL_Latin1_General_CP850_BIN). The reason I say this is that, while I'm currently unable to reproduce it, the "ÿþ" BOM has appeared twice when running this code now, at the beginning of the string AND (more mysteriously perhaps) at the end of the string. I believe it was something like this:

Error: <SQL> 'SELECT * FROM "flights_1" AS "zzz106" WHERE (0 = 1)' nanodbc/nanodbc.cpp:1587: 42S02: [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Invalid object name "ÿþflights_1ÿþ".

However, I tried replicating this on an Azure SQL Database instance and while I didn't get the weird BOMs, I ran into the same "Invalid object name" problems.

Again, thanks for your reply cderv and for adding the database tag.

ps. I edited the original post to fix some very unclear wording on my part ("my_SQL server" is inconvenient, as it is an MS SQL Server, rather than a mySQL Server. Sorry about that.


#4

Ok, I just noticed something REALLY weird. The error isn't occurring when creating this list of table pointers. It occurs only when printing the table pointers as a list by referencing the list itself by name.

table_names <- glue("flights_{1:2}")

table_list <- table_names %>% 
  map(., .f = ~tbl(con, .x))

table_list[[1]]
#> # Source:   table<flights_1> [?? x 19]
#> # Database: Microsoft SQL Server
#> #   12.00.5203[dbo@MS_SQL_Server_12/my_Database]
#>     year month   day dep_time sched_dep_time dep_delay arr_time
#>     <int> <int> <int>    <int>          <int>     <dbl>    <int>
#>  1  2013     1     1      517            515         2      830
#>  2  2013     1     1      533            529         4      850
#>  3  2013     1     1      542            540         2      923
#>  4  2013     1     1      544            545        -1     1004
#>  5  2013     1     1      554            600        -6      812
#>  6  2013     1     1      554            558        -4      740
#>  7  2013     1     1      555            600        -5      913
#>  8  2013     1     1      557            600        -3      709
#>  9  2013     1     1      557            600        -3      838
#> 10  2013     1     1      558            600        -2      753
#> # ... with more rows, and 12 more variables: sched_arr_time <int>,
#> #   arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>,
#> #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
#> #   hour <dbl>, minute <dbl>, time_hour <dttm>

So this works great, right?

EXCEPT, when I try to print it as the list that I've created, table_list, I get the error as before. And after having done that once, I can't refer to a single table pointer in the list without getting an error.

> table_list
#> [[1]]
#> Error: <SQL> 'SELECT  TOP 10 *
#> FROM "flights_1"'
#> nanodbc/nanodbc.cpp:1587: 42S02: [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Invalid object name 'flights_1'. 

> table_list[[1]]
#> Error: <SQL> 'SELECT  TOP 10 *
#> FROM "flights_1"'
#> nanodbc/nanodbc.cpp:1587: 42S02: [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Invalid object name 'flights_1'. 

So now, something that worked fine (referring to the element in the list), literally just a couple of seconds ago, doesn't work anymore once I've tried to print the list once.

This is the structure of the list.

> str(table_list)
List of 2
$ :List of 2
..$ src:List of 2
.. ..$ con  :Formal class 'Microsoft SQL Server' [package ".GlobalEnv"] with 4 slots
.. .. .. ..@ ptr     :<externalptr> 
  .. .. .. ..@ quote   : chr "\""
.. .. .. ..@ info    :List of 13
.. .. .. .. ..$ dbname               : chr "my_Database"
.. .. .. .. ..$ dbms.name            : chr "Microsoft SQL Server"
.. .. .. .. ..$ db.version           : chr "12.00.5203"
.. .. .. .. ..$ username             : chr "dbo"
.. .. .. .. ..$ host                 : chr ""
.. .. .. .. ..$ port                 : chr ""
.. .. .. .. ..$ sourcename           : chr "MS_SQL_Server_12"
.. .. .. .. ..$ servername           : chr "MS_SQL_Server_12\\SERVER"
.. .. .. .. ..$ drivername           : chr "msodbcsql13.dll"
.. .. .. .. ..$ odbc.version         : chr "03.80.0000"
.. .. .. .. ..$ driver.version       : chr "14.00.1000"
.. .. .. .. ..$ odbcdriver.version   : chr "03.80"
.. .. .. .. ..$ supports.transactions: logi TRUE
.. .. .. .. ..- attr(*, "class")= chr [1:3] "Microsoft SQL Server" "driver_info" "list"
.. .. .. ..@ encoding: chr ""
.. ..$ disco: NULL
.. ..- attr(*, "class")= chr [1:3] "src_dbi" "src_sql" "src"
..$ ops:List of 2
.. ..$ x   : 'ident' chr "flights_1"
.. ..$ vars: chr [1:19] "year" "month" "day" "dep_time" ...
.. ..- attr(*, "class")= chr [1:3] "op_base_remote" "op_base" "op"
..- attr(*, "class")= chr [1:4] "tbl_dbi" "tbl_sql" "tbl_lazy" "tbl"
$ :List of 2
..$ src:List of 2
.. ..$ con  :Formal class 'Microsoft SQL Server' [package ".GlobalEnv"] with 4 slots
.. .. .. ..@ ptr     :<externalptr> 
  .. .. .. ..@ quote   : chr "\""
.. .. .. ..@ info    :List of 13
.. .. .. .. ..$ dbname               : chr "my_Database"
.. .. .. .. ..$ dbms.name            : chr "Microsoft SQL Server"
.. .. .. .. ..$ db.version           : chr "12.00.5203"
.. .. .. .. ..$ username             : chr "dbo"
.. .. .. .. ..$ host                 : chr ""
.. .. .. .. ..$ port                 : chr ""
.. .. .. .. ..$ sourcename           : chr "MS_SQL_Server_12"
.. .. .. .. ..$ servername           : chr "MS_SQL_Server_12\\SERVER"
.. .. .. .. ..$ drivername           : chr "msodbcsql13.dll"
.. .. .. .. ..$ odbc.version         : chr "03.80.0000"
.. .. .. .. ..$ driver.version       : chr "14.00.1000"
.. .. .. .. ..$ odbcdriver.version   : chr "03.80"
.. .. .. .. ..$ supports.transactions: logi TRUE
.. .. .. .. ..- attr(*, "class")= chr [1:3] "Microsoft SQL Server" "driver_info" "list"
.. .. .. ..@ encoding: chr ""
.. ..$ disco: NULL
.. ..- attr(*, "class")= chr [1:3] "src_dbi" "src_sql" "src"
..$ ops:List of 2
.. ..$ x   : 'ident' chr "flights_2"
.. ..$ vars: chr [1:19] "year" "month" "day" "dep_time" ...
.. ..- attr(*, "class")= chr [1:3] "op_base_remote" "op_base" "op"
..- attr(*, "class")= chr [1:4] "tbl_dbi" "tbl_sql" "tbl_lazy" "tbl"

But then when I try to print the list, as mentioned, the error comes up and after that I can't even refer to a single pointer in said list. However the structure of the list is 100% unchanged.

This feels veeeeeery strange to me.


#5

Hi, what happens if you run tbl(con, "flights_1"), without assigning it to a variable? It should not work either, and if it works, can you also run show_query(tbl(con, "flights_1"))


#6

Thanks so much for replying Edgar.

Without assigning it to a variable also works for single table pointer.

#This works fine
single_table_pointer <- tbl(con, "flights_1")

#This also works fine
tbl(con, "flights_1")
#> # Source:   table<flights_1> [?? x 19]
#> # Database: Microsoft SQL Server 12.00.5203[dbo@MS_SQL_Server_12\SERVER/my_Database]
#> year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time arr_delay carrier flight tailnum origin dest 
#> <int> <int> <int>    <int>          <int>     <dbl>    <int>          <int>     <dbl> <chr>    <int> <chr>   <chr>  <chr>
#>  1  2013     1     1      517            515         2      830            819        11 UA        1545 N14228  EWR    IAH  
#>  2  2013     1     1      533            529         4      850            830        20 UA        1714 N24211  LGA    IAH  
#>  3  2013     1     1      542            540         2      923            850        33 AA        1141 N619AA  JFK    MIA  
#>  4  2013     1     1      544            545        -1     1004           1022       -18 B6         725 N804JB  JFK    BQN  
#>  5  2013     1     1      554            600        -6      812            837       -25 DL         461 N668DN  LGA    ATL  
#>  6  2013     1     1      554            558        -4      740            728        12 UA        1696 N39463  EWR    ORD  
#>  7  2013     1     1      555            600        -5      913            854        19 B6         507 N516JB  EWR    FLL  
#>  8  2013     1     1      557            600        -3      709            723       -14 EV        5708 N829AS  LGA    IAD  
#>  9  2013     1     1      557            600        -3      838            846        -8 B6          79 N593JB  JFK    MCO  
#> 10  2013     1     1      558            600        -2      753            745         8 AA         301 N3ALAA  LGA    ORD  
#> # ... with more rows, and 5 more variables: air_time <dbl>, distance <dbl>, hour <dbl>, minute <dbl>, time_hour <dttm>

# Using show_query also works fine
show_query(tbl(con, "flights_1"))
#> <SQL>
#>   SELECT *
#>   FROM "flights_1"

But using show_query in a purrr pipeline seems to show the queries correctly and then it's as if it attempts to print the first element of the list (without being asked to), and everything goes wrong.

table_names %>% 
  map(~tbl(con, .x)) %>% 
  map(~show_query(.))
#> <SQL>
#>   SELECT *
#>   FROM "flights_1"
#> <SQL>
#>   SELECT *
#>   FROM "flights_2"
#> [[1]]
#> Error: <SQL> 'SELECT  TOP 10 *
#> FROM "flights_1"'
#> nanodbc/nanodbc.cpp:1587: 42S02: [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Invalid object name 'flights_1'.

#7

Also, I don't know if this is of interest but when I have "working" pointers in the list, I can't collect() from them.

table_list[[1]]
#> # Source:   table<flights_1> [?? x 19]
#> # Database: Microsoft SQL Server 12.00.5203[dbo@MS_SQL_Server_12\SERVER/my_Database]
#> year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time arr_delay carrier
#> <int> <int> <int>    <int>          <int>     <dbl>    <int>          <int>     <dbl> <chr>  
#>  1  2013     1     1      517            515         2      830            819        11 UA     
#>  2  2013     1     1      533            529         4      850            830        20 UA     
#>  3  2013     1     1      542            540         2      923            850        33 AA     
#>  4  2013     1     1      544            545        -1     1004           1022       -18 B6     
#>  5  2013     1     1      554            600        -6      812            837       -25 DL     
#>  6  2013     1     1      554            558        -4      740            728        12 UA     
#>  7  2013     1     1      555            600        -5      913            854        19 B6     
#>  8  2013     1     1      557            600        -3      709            723       -14 EV     
#>  9  2013     1     1      557            600        -3      838            846        -8 B6     
#> 10  2013     1     1      558            600        -2      753            745         8 AA     
#> # ... with more rows, and 9 more variables: flight <int>, tailnum <chr>, origin <chr>, dest <chr>,
#> #   air_time <dbl>, distance <dbl>, hour <dbl>, minute <dbl>, time_hour <dttm>

flights_1_locally <- collect(con, table_list[[1]])
# > Error in UseMethod("collect") : 
# >   no applicable method for 'collect' applied to an object of class "c('Microsoft SQL Server', 'OdbcConnection', 'DBIConnection', 'DBIObject')"

flights_1_locally <- dbplyr::db_collect(con, table_list[[1]])
#> Error in (function (classes, fdef, mtable)  : 
#>             unable to find an inherited method for function ‘dbSendQuery’ for signature ‘"Microsoft SQL Server", "tbl_dbi"’
``

#8

collect does not work on a connection but as any {dplyr} verbs. You collect at the end of a pipe. collect(table_list[[1]]) should get you a result. See the help page


#9

Whoops, you're absolutely right. With me running in circles trying to test everything, it seems I've had a rather silly brainfart. :sweat_smile: