We are having massive performance issues when querying our Oracle DB from Rstudio (Pro) using ROracle package.
Basically whenever you run a query:
It works (which is a good start)
But the session is lagging a lot: you can't even open a file from the browser until the query is finished, and don't even think of running 2 sessions at the same time because if you do then both of them just freeze completely
Any idea how this could be improved or any other package anyone could suggest to fetch from Oracle?
I haven't used Oracle in 12 years, but what you're describing does not sound like an Oracle issue. This sounds like an issue with the reality that R is single threaded. This means that while R is busy doing one thing, you can't do something else. R Studio waits on R before becoming responsive.
One way around this is to run your ROracle scripts as jobs in R Studio. However you should keep in mind that the way this works is it spawns a new R worker and runs your code in there in order to keep your main R session responsive. So objects from the job are not copied over to your current workspace. You will need to have your job write any results out to a file and then import those results in another R session in order to do things with them.
Thanks a lot!
This sounds like a very good workaround! Too bad we don't have RStudio 1.2 yet, so no "Jobs" feature which would have made that solution much easier to use on a daily basis...
I guess we'll need to run these through crontab until we get upgraded, unless you'd know of another way? (I know I'm pushing my luck here )
well.... If you have long running queries that you run regularly, I really like setting up ETL processes that extract subsets of aggregated data and shove those into a database table at a regular interval (i.e. nightly). For scheduling nightly processes I like Airflow. But cron totally works. Airflow adds some logging, email notification of job status, dependency DAG (don't run this before that completes) and a web GUI, which can be nice.
Thanks for that. I will have a look at both other solutions.
Meanwhile, I was thinking, is there a way to assign a different thread to each RStudio session?
Because, it seems like our issue is that: if I work on 2 sessions, they actually use the same thread which explains why running an SQL query on session 1 impacts session 2.
And if we could modify something in the RStudio setup (if possible) then maybe we could get a single thread for each session even if they come from the same user?
Sorry for the confusion.
Yes we do have multiple sessions but they all run on the same thread.
So I can open 2 sessions at the same time on RStudio, but if I run a SQL query on 1 session it freezes both sessions.
That's why I'm wondering if there is something in the server's options that would allow running multiple sessions on multiple threads? (so 1 single thread for each session)
you likely should repost this specific threading question under the RStudio category. I'm suspicious the RStudio folks who know the answer may not read the ROracle questions. Or you could ping your account rep.