Can't get sas data using dbplyr, odbc

I am able to connect to a sas odbc connection using an odbc named data source on windows folder (containing iris.sas7bdat), but can not access data.

# packages 
library(dbplyr)
library(odbc)
library(dplyr)

# connecting works
# odbc connection was created with windows tools, and named 'sas_iris_9'
con <- dbConnect(odbc::odbc(), "sas_iris_9")

# after connection contents of con:
# <OdbcConnection> __5001
# SAS Version: 9.04.01M6P110718

# functions dbGetInfo(), dbListTables(), dbExistsTable(), dbListFields() all work 

# BUT anything involving actual data movement fails, such as following command to get iris table:
iris_tbl <- tbl(con, sql("select * from IRIS"))

# error
# Error: nanodbc/nanodbc.cpp:1611: 
#   <SQL> 'SELECT *
# FROM (select * from IRIS)  zzz6 
# WHERE (0 = 1)'

# same error, same c++ file, same line, with dbGetQuery() 
dbGetQuery(con, "select * from iris") 
# Error: nanodbc/nanodbc.cpp:1611: 
# <SQL> 'select * from iris'

# the return from dbGetInfo(con) follows:

$dbname [1] "" 
$dbms.name [1] "SAS" 
$db.version [1] "9.04.01M6P110718" 
$username [1] "" 
$host [1] "" 
$port [1] "" 
$sourcename [1] "sas_iris_9" 
$servername [1] "__5001" 
$drivername [1] "SASDRV32.DLL" 
$odbc.version [1] "03.80.0000" 
$driver.version [1] "09.04.0000" 
$odbcdriver.version [1] "03.52" 
$supports.transactions [1] FALSE 
$getdata.extensions.any_column [1] TRUE 
$getdata.extensions.any_order [1] TRUE attr(,"class") [1] "SAS" "driver_info" "list"

# package, R versions
os       Windows 10 x64
dplyr * 0.8.3 2019-07-04 [1] CRAN (R 3.6.1)
dbplyr * 1.4.2 2019-06-17 [1] CRAN (R 3.6.1)
odbc        * 1.2.1   2019-12-05 [1] CRAN (R 3.6.1)

as aside, I frequently access sas, via odbc, through other platforms, c#, python, excel, access, etc.

I use the haven package to read sas7bdat files. This might be a much simpler solution rather than trying to connect as a database.

library(haven)
path <- system.file("examples", "iris.sas7bdat", package = "haven")
test.read.sas <- read_sas(path)
test.read.sas
#> # A tibble: 150 x 5
#>    Sepal_Length Sepal_Width Petal_Length Petal_Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <chr>  
#>  1          5.1         3.5          1.4         0.2 setosa 
#>  2          4.9         3            1.4         0.2 setosa 
#>  3          4.7         3.2          1.3         0.2 setosa 
#>  4          4.6         3.1          1.5         0.2 setosa 
#>  5          5           3.6          1.4         0.2 setosa 
#>  6          5.4         3.9          1.7         0.4 setosa 
#>  7          4.6         3.4          1.4         0.3 setosa 
#>  8          5           3.4          1.5         0.2 setosa 
#>  9          4.4         2.9          1.4         0.2 setosa 
#> 10          4.9         3.1          1.5         0.1 setosa 
#> # ... with 140 more rows

Created on 2019-12-23 by the reprex package (v0.3.0)

1 Like

Thanks, I use that for smaller datasets, but I have some larger sas datasets I'd like to access using dbplyr, that wouldn't fit into memory.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

I've made some progress connecting to SAS via ODBC, with several R packages.

library(RODBCDBI)
library(odbc)
library(DBI)

The following 2 connections, using different packages, both work:

rodbc_con <- dbConnect(RODBCDBI::ODBC(), dsn = "my_sas_odbc_dsn", believeNRows=FALSE, colQuote=NULL)
odbc_con <- dbConnect(odbc(), "my_sas_odbc_dsn", believeNRows=FALSE, colQuote = NULL)

After both of these, functions like dbGetInfo() and dbListTables() work.

But while the function dbGetQuery() works with the RODBCDBI::ODBC() connection it does not work with the odbc() connection.

the following results in an error

iris_from_sas <- dbGetQuery(odbc_con, "select * from sasiris.iris")

error

Error: nanodbc/nanodbc.cpp:1611:
'select * from sasiris.iris'

traceback()
9: stop(list(message = "nanodbc/nanodbc.cpp:1611: \n 'select * from sasiris.iris'",
call = NULL, cppstack = NULL))
8: new_result(connection@ptr, statement, immediate)
7: OdbcResult(connection = conn, statement = statement, params = params,
immediate = immediate)
6: .local(conn, statement, ...)
5: dbSendQuery(conn, statement, params = params, ...)
4: dbSendQuery(conn, statement, params = params, ...)
3: .local(conn, statement, ...)
2: dbGetQuery(odbc_con, "select * from sasiris.iris")
1: dbGetQuery(odbc_con, "select * from sasiris.iris")

Another odd thing happens. After the call to connect via odbc(), the Connections pane shows a number of SAS schemas (libraries in SAS), with hierarchy icons on the right. Clicking the arrow to the left of a schema name will expand and show a list of tables, and clicking on a table the same way, will show a list of columns, but clicking on the data view icon results in RStudio opening a small dialog saying: R code execution error

I understand this is not a reprex - but it is fairly easy, if you have SAS and ODBC driver, to create an odbc dsn connection to a folder where the iris.sas7bdat is located, and reproduce these results.

Also, I don't see SAS as a supported database in the RStudio database pages. Is there any plan to support SAS in this way? I frequently use python to access SAS, using the pyodbc library, and do not encounter the types of problems described here.