Unable to upload dataframe with odbc

Hello there,

I am able to connect to my database using odbc, but I get a strange error when I try to upload a dataframe there. See below:

library(odbc)
library(DBI)
library(tibble)
library(dplyr)
library(readr)
con <- dbConnect(odbc::odbc(), "DV_MYDV", dbname = "mydb")

The connection works:

dbListTables(con)
[1] "version" "NewTestTable"
[3] "test1" "test2"
[5] "test3" "testtable1"

However, uploading a dataframe fails. How is that possible? I can easily create tables with SQL queries such as CREATE TABLE.

dbWriteTable(con, name = "iris", value = as.data.frame(dataframe), overwrite = TRUE)
Error in new_result(connection@ptr, statement) :
nanodbc/nanodbc.cpp:1665: HY000: Table creation not allowed.

This is annoying because I need to upload large dataframes to my database. Working row by row would not be efficient.

Thanks!

I wouldn't recommend using dbWriteTable() or copy_to() for large table uploads. Ideally, we would keep the data transformation inside the database rather than in R; this can be done using dplyr: http://db.rstudio.com/overview
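
For example, keeping the transformation in the database can look like this (just a sketch: "test1" comes from your dbListTables() output above, and the column names are made up for illustration):

library(dplyr)
library(dbplyr)

# lazy reference to a remote table; no rows are pulled into R yet
remote_tbl <- tbl(con, "test1")

# these verbs get translated to SQL and run on the database server
summary_tbl <- remote_tbl %>%
  group_by(some_column) %>%   # hypothetical column name
  summarise(n = n())

# collect() brings only the small final result into R
local_df <- collect(summary_tbl)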

May I ask what operation you need to do in R before uploading to the DB?


Hi @edgararuiz, thanks for helping. No, the point is that I have access to a remote database and I need to upload a dataframe there so that I can merge it with other tables. So using dbWriteTable() or similar is necessary. What I don't understand is that I am able to create tables with SQL statements, but somehow dbWriteTable() can't. Any ideas? Thanks!!
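
If the permissions allow it, the usual pattern for that merge is to push the local dataframe up as a temporary table and let the join run in the database. A sketch (mydf, the remote table name "test2", and the join key "id" are all placeholders; note copy_to() may hit the same "Table creation not allowed" restriction as dbWriteTable()):

library(dplyr)

# upload the local dataframe as a temporary table on the server
tmp <- copy_to(con, mydf, name = "mydf_tmp", temporary = TRUE)

# join against the remote table; the join itself happens in the database
merged <- tbl(con, "test2") %>%
  inner_join(tmp, by = "id") %>%
  collect()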

Have you tested CREATE TABLE directly against the database, using dbSendQuery()?
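
Something along these lines (the table name is just an example):

res <- dbSendQuery(con, "CREATE TABLE test_create (x INTEGER)")
dbClearResult(res)  # free the result set afterwards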


Actually, when I try to create a table with dbSendQuery(), RStudio crashes entirely. This is weird, as I am able to run a CREATE TABLE query in Python with pyodbc's cursor.execute(myquery)...
Any ideas?

thanks!!

Are you using the exact same DSN for both (Python & R) connections?


I believe so. In Python I use

import pyodbc
import pandas as pd
import numpy as np

# connect through the DSN and run the CREATE TABLE statement
cnxn = pyodbc.connect('DSN=DV_MYDSN')
cnxn.timeout = 360000
cursor = cnxn.cursor()

SQL = """CREATE TABLE customdb.mylogin.mytable (t1 varchar(255))"""
cursor.execute(SQL)
print("Table created")

In R I run

library(odbc)
library(DBI)
library(tibble)
library(dplyr)
library(readr)

con <- dbConnect(odbc::odbc(), "DV_MYDV")

mysql <- 'create table customdb.mylogin.mytable'
dbSendQuery(con, mysql)

#computer explodes

OK, why are the two SQL statements different? Should we run the exact same ones?

Also, can you confirm that the table was indeed created by the Python code? I wonder if it's failing silently.
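
You can check from the R side, e.g.:

# does the table show up after the Python run?
dbExistsTable(con, "mytable")
"mytable" %in% dbListTables(con)
# for a schema-qualified name you may need DBI::Id(), depending on the driver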


Yes, you are right. When I run the exact same query (with the field definition at the end) I get

dbSendQuery(con, mysql)
Error in new_result(connection@ptr, statement) :
nanodbc/nanodbc.cpp:1665: HY000: Query already resolved. [parser-2901050]

Also, I confirm the table gets created after the python call.

Well, I was actually referring to the fact that in the first one there's a field being created.

Can we try this?

dbSendQuery(con, "CREATE TABLE customdb.mylogin.mytable (t1 varchar(255))")
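
As a side note, for statements that return no rows (CREATE TABLE, DROP, INSERT, ...) dbExecute() is usually a better fit than dbSendQuery(), since dbSendQuery() leaves an open result that should be freed with dbClearResult():

# runs the statement and cleans up the result in one call
dbExecute(con, "CREATE TABLE customdb.mylogin.mytable (t1 varchar(255))")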

Yes, that's what I just ran. I got the "Query already resolved" error.

What does that even mean?

Stupid question, but what about dropping the table before the second execution of the create script? It seems you are creating the same table twice.
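
i.e. something like (adjust the qualified name to your own):

dbExecute(con, "DROP TABLE customdb.mylogin.mytable")
dbExecute(con, "CREATE TABLE customdb.mylogin.mytable (t1 varchar(255))")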

I tried creating a whole new one, actually. I get the same error.

It was a cheap shot, I know...

I access SQL server regularly, the connection I use is as follows:

library(odbc)
con <- dbConnect(odbc(), Driver = "ODBC Driver 17 for SQL Server", 
    Server = "my-server", Database = "my-database", UID = "my-login", 
    PWD = "my-password", Port = 1433)

MS SQL Server is not my preferred RDBMS for working from R (Postgres is much more fun), but it works; and the benefits of an MS ecosystem (Power BI and such) are unquestionable, especially in a corporate environment.
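
With a connection like that, a bulk upload along the lines of the original post works in one call (sketch using the built-in iris dataset):

# one bulk upload, no row-by-row inserts
dbWriteTable(con, name = "iris_upload", value = iris, overwrite = TRUE)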