When I try to export a database connection to the workers of a parallel cluster, I get this error: Error in connection_info(dbObj@ptr) : external pointer is not valid
Here is my code:
cl <- makeCluster(detectCores(logical = FALSE))
clusterEvalQ(cl=cl, {
  require(tidyverse)
  require(parallel)
  require(DBI)
  require(odbc)
  con <- dbConnect(odbc::odbc(),
                   Driver = "Cloudera ODBC Driver for Impala",
                   Host = "server", Port = 1234)
})
clusterExport(cl=cl,"QueryList")
clusterExport(cl=cl,"con")
# ,envir = environment()
# clusterEvalQ(cl=cl,QueryList)
# clusterEvalQ(cl=cl,con)
QryResults <- parLapply(cl=cl,QueryList,dbGetQuery,conn=con)
#stopCluster(cl=cl)
stopCluster(cl=cl); print("Cluster stopped.")
I tried both clusterEvalQ and clusterExport as above and got the same result.
Would anyone happen to know the solution to this problem?
Parallel (multicore) processing is a powerful way to speed up an analysis, especially when parts of it can run at the same time, independently of each other, and have their results combined afterwards. A clear example is a SQL/HQL query made up of 36 UNION ALL branches that takes 10 minutes instead of 30 (roughly total time divided by the number of cores). On a Mac this is simple because of its ability to fork, which gives each worker a copy of your loaded packages and connections. On a Windows machine it is far more challenging, because forking is not available there.
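A minimal sketch of one way around the error, assuming the workers are separate R sessions (as they are with makeCluster() on Windows). A connection object wraps an external pointer that is only valid in the process that created it, so it cannot be exported from one session to another. Instead, each worker opens its own connection, and the worker function looks up con by name in that worker's session. The connection settings are copied from the question; run_query is a name made up for this sketch, and the QueryList shown is only a placeholder for the real list of queries.

```r
library(parallel)

# Placeholder queries for illustration; replace with the real QueryList.
QueryList <- c("SELECT 1", "SELECT 2")

cl <- makeCluster(detectCores(logical = FALSE))

# Open one connection per worker. It stays in that worker's global environment.
clusterEvalQ(cl, {
  library(DBI)
  con <- dbConnect(odbc::odbc(),
                   Driver = "Cloudera ODBC Driver for Impala",
                   Host = "server", Port = 1234)
  NULL  # return NULL so the connection object is not sent back to the master
})

# Defined at top level, so `con` is resolved in the worker's global
# environment when the function runs there, not on the master.
run_query <- function(qry) DBI::dbGetQuery(con, qry)

QryResults <- parLapply(cl, QueryList, run_query)

# Close each worker's connection before shutting the cluster down.
clusterEvalQ(cl, { DBI::dbDisconnect(con); NULL })
stopCluster(cl)
```

With this layout neither clusterExport(cl, "con") nor clusterExport(cl, "QueryList") is needed: con never exists on the master, and parLapply ships the elements of QueryList to the workers itself. The trailing NULL matters because clusterEvalQ returns the value of the block from every worker; without it the connection objects come back to the master with dead pointers, and printing them there is a likely source of the "external pointer is not valid" message.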