Parallel / Multicore processing - Exporting connection(s) to clusters

Would anyone happen to know the solution to this problem?

When I try to export a connection to the cluster workers, I get this error:

Error in connection_info(dbObj@ptr) : external pointer is not valid

Here is my code:



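library(parallel)

# Start one worker per physical core.
cl <- makeCluster(detectCores(logical = FALSE))

# Load the packages and open a connection on every worker.
clusterEvalQ(cl = cl, {
  require(tidyverse)
  require(parallel)
  require(DBI)
  require(odbc)
  con <- dbConnect(odbc::odbc(),
                   Driver = "Cloudera ODBC Driver for Impala",
                   Host = "server", Port = 1234)
})

# Copy the query list and the connection from the master to the workers.
clusterExport(cl = cl, "QueryList")
clusterExport(cl = cl, "con")
              # , envir = environment()
# clusterEvalQ(cl = cl, QueryList)
# clusterEvalQ(cl = cl, con)

# Run each query in QueryList on the cluster.
QryResults <- parLapply(cl = cl, QueryList, dbGetQuery, conn = con)

# stopCluster(cl = cl)
stopCluster(cl = cl); print("Cluster stopped.")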

I tried both clusterEvalQ and clusterExport as above and got the same result.
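One likely cause, offered as a guess rather than a confirmed diagnosis: a database connection wraps an external pointer that is only valid in the R process that created it, so it cannot be serialized between the master and the workers. The sketch below opens one connection per worker and never moves it across processes. It makes three assumptions that are not in the post: an in-memory RSQLite database stands in for the Impala connection so that the example runs anywhere (swap in the `dbConnect(odbc::odbc(), ...)` call from above), `QueryList` is filled with placeholder queries, and `run_query` is a hypothetical helper name.

```r
library(parallel)

# Placeholder queries (assumption); in the post this is the existing QueryList.
QueryList <- list("SELECT 1 AS x", "SELECT 2 AS x")

cl <- makeCluster(2)  # assumed worker count for this sketch

# Open one connection per worker, in that worker's global environment.
# Ending the block with NULL keeps the connection object from being sent
# back to the master, where its external pointer would not be valid.
clusterEvalQ(cl, {
  library(DBI)
  # Stand-in connection (assumption); replace with the odbc::odbc() call.
  con <- dbConnect(RSQLite::SQLite(), ":memory:")
  NULL
})

# `con` is never exported from the master. Inside the function it is looked
# up on the worker, so each worker uses the connection it opened itself.
run_query <- function(qry) DBI::dbGetQuery(con, qry)

QryResults <- parLapply(cl, QueryList, run_query)

# Close each worker's connection before shutting the cluster down.
clusterEvalQ(cl, { DBI::dbDisconnect(con); NULL })
stopCluster(cl)
```

With this layout only the query strings and the result data frames cross process boundaries, so `clusterExport(cl, "con")` is not needed at all.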

Would anyone happen to know the solution to this problem?

Parallel (multicore) processing is a powerful way to speed up an analysis, especially when parts of it can run at the same time, independently of each other, and have their results combined afterwards. A clear example is a 36-part UNION ALL query (SQL/HQL) taking 10 min instead of 30 min (roughly total time / number of cores). On a Mac it is simple because R can fork, so each child process starts with a copy of your loaded packages and connections. On a Windows machine, where forking is not available, it is far more challenging.
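The platform difference can be sketched with base R's `parallel` package alone: `mclapply` forks on macOS and Linux, while on Windows a cluster of separate R processes has to be started and stopped explicitly. `run_task`, `tasks`, and the worker count are hypothetical placeholders for independent pieces of work such as individual queries; no database connection is involved here.

```r
library(parallel)

# Hypothetical stand-in for one independent piece of work (e.g. one query).
run_task <- function(i) i^2

tasks <- 1:4
n_workers <- 2L  # assumed worker count for illustration

if (.Platform$OS.type == "windows") {
  # No fork() on Windows: start separate R processes and always stop them.
  cl <- makeCluster(n_workers)
  results <- tryCatch(parLapply(cl, tasks, run_task),
                      finally = stopCluster(cl))
} else {
  # macOS/Linux: child processes are forked copies of the current session.
  results <- mclapply(tasks, run_task, mc.cores = n_workers)
}

# Bring the independent results back together.
combined <- unlist(results)
```

The cluster branch also works on macOS and Linux, so code written for Windows stays portable; the forked branch is simply less setup where it is available.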

Any thoughts on what would be the next step?
