Can't get sas data using dbplyr, odbc

I am able to connect to a sas odbc connection using an odbc named data source on windows folder (containing iris.sas7bdat), but can not access data.

# packages 
library(dbplyr)
library(odbc)
library(dplyr)

# connecting works
# odbc connection was created with windows tools, and named 'sas_iris_9'
con <- dbConnect(odbc::odbc(), "sas_iris_9")

# after connection contents of con:
# <OdbcConnection> __5001
# SAS Version: 9.04.01M6P110718

# functions dbGetInfo(), dbListTables(), dbExistsTable(), dbListFields() all work 

# BUT anything involving actual data movement fails, such as following command to get iris table:
iris_tbl <- tbl(con, sql("select * from IRIS"))

# error
# Error: nanodbc/nanodbc.cpp:1611: 
#   <SQL> 'SELECT *
# FROM (select * from IRIS)  zzz6 
# WHERE (0 = 1)'

# same error, same c++ file, same line, with dbGetQuery() 
dbGetQuery(con, "select * from iris") 
# Error: nanodbc/nanodbc.cpp:1611: 
# <SQL> 'select * from iris'

# the return from dbGetInfo(con) follows:

$dbname [1] "" 
$dbms.name [1] "SAS" 
$db.version [1] "9.04.01M6P110718" 
$username [1] "" 
$host [1] "" 
$port [1] "" 
$sourcename [1] "sas_iris_9" 
$servername [1] "__5001" 
$drivername [1] "SASDRV32.DLL" 
$odbc.version [1] "03.80.0000" 
$driver.version [1] "09.04.0000" 
$odbcdriver.version [1] "03.52" 
$supports.transactions [1] FALSE 
$getdata.extensions.any_column [1] TRUE 
$getdata.extensions.any_order [1] TRUE attr(,"class") [1] "SAS" "driver_info" "list"

# package, R versions
os       Windows 10 x64
dplyr * 0.8.3 2019-07-04 [1] CRAN (R 3.6.1)
dbplyr * 1.4.2 2019-06-17 [1] CRAN (R 3.6.1)
odbc        * 1.2.1   2019-12-05 [1] CRAN (R 3.6.1)

as aside, I frequently access sas, via odbc, through other platforms, c#, python, excel, access, etc.

I use the haven package to read sas7bdat files. This might be a much simpler solution rather than trying to connect as a database.

library(haven)
path <- system.file("examples", "iris.sas7bdat", package = "haven")
test.read.sas <- read_sas(path)
test.read.sas
#> # A tibble: 150 x 5
#>    Sepal_Length Sepal_Width Petal_Length Petal_Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <chr>  
#>  1          5.1         3.5          1.4         0.2 setosa 
#>  2          4.9         3            1.4         0.2 setosa 
#>  3          4.7         3.2          1.3         0.2 setosa 
#>  4          4.6         3.1          1.5         0.2 setosa 
#>  5          5           3.6          1.4         0.2 setosa 
#>  6          5.4         3.9          1.7         0.4 setosa 
#>  7          4.6         3.4          1.4         0.3 setosa 
#>  8          5           3.4          1.5         0.2 setosa 
#>  9          4.4         2.9          1.4         0.2 setosa 
#> 10          4.9         3.1          1.5         0.1 setosa 
#> # ... with 140 more rows

Created on 2019-12-23 by the reprex package (v0.3.0)

1 Like

Thanks, I use that for smaller datasets, but I have some larger sas datasets I'd like to access using dbplyr, that wouldn't fit into memory.

I've made some progress connecting to SAS via ODBC, with several R packages.

library(RODBCDBI)
library(odbc)
library(DBI)

The following 2 connections, using different packages, both work:

rodbc_con <- dbConnect(RODBCDBI::ODBC(), dsn = "my_sas_odbc_dsn", believeNRows=FALSE, colQuote=NULL)
odbc_con <- dbConnect(odbc(), "my_sas_odbc_dsn", believeNRows=FALSE, colQuote = NULL)

After both of these, functions like dbGetInfo() and dbListTables() work.

But while the function dbGetQuery() works with the RODBCDBI::ODBC() connection it does not work with the odbc() connection.

the following results in an error

iris_from_sas <- dbGetQuery(odbc_con, "select * from sasiris.iris")

error

Error: nanodbc/nanodbc.cpp:1611:
'select * from sasiris.iris'

traceback()
9: stop(list(message = "nanodbc/nanodbc.cpp:1611: \n 'select * from sasiris.iris'",
call = NULL, cppstack = NULL))
8: new_result(connection@ptr, statement, immediate)
7: OdbcResult(connection = conn, statement = statement, params = params,
immediate = immediate)
6: .local(conn, statement, ...)
5: dbSendQuery(conn, statement, params = params, ...)
4: dbSendQuery(conn, statement, params = params, ...)
3: .local(conn, statement, ...)
2: dbGetQuery(odbc_con, "select * from sasiris.iris")
1: dbGetQuery(odbc_con, "select * from sasiris.iris")

Another odd thing happens. After the call to connect via odbc(), the Connections pane shows a number of SAS schemas (libraries in SAS), with hierarchy icons on the right. Clicking the arrow to the left of a schema name will expand and show a list of tables, and clicking on a table the same way, will show a list of columns, but clicking on the data view icon results in RStudio opening a small dialog saying: R code execution error

I understand this is not a reprex - but it is fairly easy, if you have SAS and ODBC driver, to create an odbc dsn connection to a folder where the iris.sas7bdat is located, and reproduce these results.

Also, I don't see SAS as a supported database in the RStudio database pages. Is there any plan to support SAS in this way? I frequently use python to access SAS, using the pyodbc library, and do not encounter the types of problems described here.

2 Likes

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.