Issue fetch data from SQL using RODBC (memory issue)

Hi,

I'm trying to fetch some data from SQL into R using the following code:

library(RODBC)

conn3 <- RODBC::odbcDriverConnect( 'driver={SQL Server};server=SERVERNAME;database=DB_NAME;trusted_connection=TRUE' )

my_data <- RODBC::sqlFetch(channel = conn3 ,sqtable = "Table name" ,stringsAsFactors = FALSE ,na.strings = "")

The data set is 1,032,576 KB and so unsurprisingly when I try to fetch from SQL to R I get an error message saying the memory has been exceeded.

Is there an R package that pulls data in from SQL and stores it in a way that means that the memory isn't exceeded?

The general best practice is to not pull unaggregated data from databases into R immediately, but rather to use R to direct the SQL engine to process the data in the DB, and pull the relevant results. The recommended packages for that are

library(odbc)
library(dplyr)
library(dbplyr)

You can read more at db.rstudio.com, and specifically https://db.rstudio.com/dplyr/

2 Likes

Hi,

Thanks for responding. I thought this might be best practice but I want to keep my data in tidy format as much as possible (e.g. one token per row) for various reasons.

Is the way you've suggested the only way? If so I'll have a look at your recommended libraries.

By using dbplyr your data will remain in a tidy format in the sql server but you will be able to manipulate it using dplyr commands from your R session, only fetching results when needed. You can even make plots and models processing the data in the sql server and only fetching results.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.