I am having some problems using Rcpp (and RcppArmadillo) with the parallel package. The project is a bit complex and I haven't been able to put together a reprex yet, but I want to explain the nature of the problem here first to get possible insights:
I have a function, say function A, that calls function B. Function B creates a cluster using makeCluster from the parallel package, with the number of workers set by the number of cores specified. Function B then passes the cluster and my data to function C. Function C splits my data into equal parts and applies function D to each part using parLapply.
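To make the structure concrete, here is a rough sketch of the call chain. The names function_a to function_d and the toy data are placeholders, not my actual code. function_d here is a trivial R stand-in for the real function D, and using clusterSplit for the splitting is an assumption made for the sketch:

```r
library(parallel)

# Placeholder for function D (hypothetical body; the real one does the heavy computation).
function_d <- function(chunk) chunk^2

# Function C: split the data into one contiguous part per worker and apply D to each part.
function_c <- function(cl, x) {
  parts <- clusterSplit(cl, x)
  unlist(parLapply(cl, parts, function_d))
}

# Function B: create the cluster, then pass it and the data on to C.
function_b <- function(x, n_cores) {
  cl <- makeCluster(n_cores)
  on.exit(stopCluster(cl))  # always shut the workers down
  function_c(cl, x)
}

# Function A: entry point.
function_a <- function(x, n_cores = 2) function_b(x, n_cores)

result <- function_a(1:8, n_cores = 2)
```

With this layout, the first element of parts goes to the first worker, the second to the next worker, and so on, so each part of the output maps onto one worker.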
Function D is written in Rcpp with RcppArmadillo to speed up computation. I also have a version of function D written in plain R; it is very slow, but I use it to check the correctness of the Rcpp version.
If I run my project (with function D implemented in RcppArmadillo) using 4 cores, the first quarter of the results is correct (it matches the output of the R version of function D) while the last three quarters are wrong. If I use 2 cores, the first half of the results is correct, and if I use 1 core, all of the results are correct. This behaviour makes me think that the Rcpp implementation of function D itself is correct, but that some variable is being corrupted across the workers, which would explain why only part of the output is correct and why everything is correct with a single core. Has anybody experienced this? Any insights?
A few notes:
- I use sourceCpp to load necessary Rcpp functions.
- I see this behaviour with both PSOCK and FORK cluster types. For PSOCK, I used clusterEvalQ to source the cpp functions on the workers; for FORK, I just used sourceCpp to load the cpp functions into the global environment, since FORK workers have access to the global environment (I think).
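
For reference, the PSOCK setup looks roughly like the sketch below. The C++ code is a hypothetical stand-in (plain Rcpp rather than RcppArmadillo, and passed inline instead of from a file), and function_d_cpp is a placeholder name, so treat it as an outline of the loading pattern rather than my actual code:

```r
library(parallel)

# Hypothetical stand-in for the real C++ source of function D.
cpp_code <- '
#include <Rcpp.h>
// [[Rcpp::export]]
Rcpp::NumericVector function_d_cpp(Rcpp::NumericVector x) {
  return x * 2.0;
}
'

cl <- makeCluster(2)           # PSOCK is the default cluster type
clusterExport(cl, "cpp_code")  # send the source string to every worker

# Compile on each worker so that every process has its own copy of the function.
invisible(clusterEvalQ(cl, Rcpp::sourceCpp(code = cpp_code)))

parts <- clusterSplit(cl, as.numeric(1:8))

# The wrapper looks function_d_cpp up by name on the worker,
# instead of shipping a compiled function object from the master.
psock_result <- unlist(parLapply(cl, parts, function(chunk) function_d_cpp(chunk)))

stopCluster(cl)
```

The point of the wrapper is that a function created by sourceCpp points at compiled code loaded in one particular R process, so each worker has to compile and resolve it locally.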
Again, I know my description is very vague at this point, but I wanted to get insights from experienced people on what might be causing this while I work on a reprex, since the project is very complex.
Thanks a lot.