Why does RStudio documentation recommend odbc vs jdbc drivers?

databases
driver
jdbc
odbc

#1

Having been researching interfaces with databases a lot recently (and with the new v1.1 RStudio release it’s easier than ever!) I’ve been reading a fair amount of documentation and best practices.

Older redshift documentation suggests using jdbc, while newer RStudio documentation suggests odbc.

Is there a specific reason for this? More secure, faster, better supported? or is it just a little more practical i.e. are the jdbc drivers licensed/controlled etc?


#2

My understanding is that odbc is DBI-compliant and jdbc is not. There are also supposedly performance advantages. However, the RStudio people will be able to provide a proper explanation.


#3

Generally, any R package that uses java is a pain to set up.


#4

Fair enough. Have found that myself using a few things. Is it fair to see that odbc is the “preferred” connection method then? asking for a friend/the rest of my analytics department


#5

I’d say the order of preference is roughly:

  • Native database driver implemented in database package (e.g. RPostgresSQL)
  • ODBC as implemented in odbc package
  • JDBC as implemented in rjdbc package

But if you’re connecting to a bunch of databases that all have ODBC drivers, you might prefer the increased consistency of using odbc for everything.


Dplyr::tbl not working with HIVE
#6

Fantastic, thank you.


#7

I just added an article about this on the site: https://db.rstudio.com/best-practices/select-interface/