Unable to disable multithreading in R

Hi all,

I've searched the web extensively and have tried to troubleshoot this for a few months, but I'm at my wits' end!

Across a variety of R packages that offer parallelization, I am unable to disable parallel execution or limit the number of cores used.

I first noticed this when running the DESeq2 package, which uses the BiocParallel package for parallel computing (described in an earlier post). I would see a large number of my cores engaged despite registering a serial backend. Even when I explicitly registered a parallel backend and set the number of workers to 1 or 2, the setting was ignored and many cores were engaged.
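For reference, registering a serial BiocParallel backend and checking what is actually registered can be sketched roughly as follows. This is a minimal sketch, assuming BiocParallel is installed; the variable name n_workers is only illustrative.

```r
library(BiocParallel)

# Register a serial backend as the default for BiocParallel-aware functions
register(SerialParam())

# Inspect the backend that BiocParallel will now use by default
default_backend <- bpparam()
print(default_backend)

# Number of workers of the default backend (illustrative variable name)
n_workers <- bpnworkers(default_backend)
print(n_workers)
```

If this reports a single worker while many cores are still busy, the extra load is presumably not coming from the BiocParallel backend itself.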

Recently, I have been using the caret package for machine learning and ran into this problem again.
As a minimal example, I ran this in a fresh R session and it engages 65 cores.

library(caret)

random_matrix <- matrix(rnorm(1e6), 
                        nrow = 1000, 
                        ncol = 1000)

random_df <- as.data.frame(random_matrix)
random_df$class <- as.factor(rbinom(1000, 1, 0.5))

training <- train(class ~ .,
                  data = random_df,
                  method = 'glm')

According to caret's guide to parallel processing, doParallel, foreach, and doMC are all compatible. I followed the given example and used doParallel.
However, even when I run the following code to explicitly set 2 workers, it still engages 65 cores.
This happens even when I pass allowParallel = FALSE to trainControl().

library(caret)
library(doParallel)

cl <- makePSOCKcluster(2) # 2 workers
registerDoParallel(cl)

random_matrix <- matrix(rnorm(1e6), 
                        nrow = 1000, 
                        ncol = 1000)

random_df <- as.data.frame(random_matrix)
random_df$class <- as.factor(rbinom(1000, 1, 0.5))

training <- train(class ~ .,
                  data = random_df,
                  method = 'glm',
                  trControl = trainControl(allowParallel = FALSE))

I don't think this issue relates to the caret package specifically, since I have run into the same problem in a completely unrelated package (DESeq2).

I'm wondering whether a parallel backend has been "secretly" registered somewhere for 65 workers without my knowledge. I am at a complete loss as to how to find it and set it properly. I have tried explicitly setting the number of workers in BiocParallel, future, foreach, doParallel, and doMC, but nothing seems to make a difference.
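A minimal sketch of how the foreach side of this can be inspected and reset, assuming only that the foreach package is installed (the variable names are illustrative):

```r
library(foreach)

# Has any foreach backend been registered, and under what name?
backend_registered <- getDoParRegistered()
backend_name <- getDoParName()
print(backend_registered)
print(backend_name)

# Number of workers foreach would currently use
print(getDoParWorkers())

# Force foreach back to sequential execution and check again
registerDoSEQ()
workers_after_reset <- getDoParWorkers()
print(workers_after_reset)

# Related settings that can influence default worker counts
print(getOption("mc.cores"))     # NULL if unset
print(parallel::detectCores())   # cores visible to R
```

After registerDoSEQ(), foreach loops run sequentially, so any remaining multi-core activity would have to originate outside the foreach backend.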

I also wonder if it may relate to this: https://github.com/HenrikBengtsson/Wishlist-for-R/issues/94

How would I go about tracking down this problem? Is there a way to completely reset my parallel parameters?

If it helps, I am running R 3.6.3 on an Ubuntu 18.04 server. Here is my sessionInfo():

R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8      
 [8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] doParallel_1.0.15 iterators_1.0.12  foreach_1.5.0     caret_6.0-86      ggplot2_3.3.0     lattice_0.20-41  

loaded via a namespace (and not attached):
 [1] rstan_2.19.3         tidyselect_1.0.0     xfun_0.13            purrr_0.3.4          reshape2_1.4.4       splines_3.6.3        colorspace_1.4-1     vctrs_0.2.4          generics_0.0.2      
[10] stats4_3.6.3         loo_2.2.0            survival_3.1-12      prodlim_2019.11.13   rlang_0.4.6          pkgbuild_1.0.7       ModelMetrics_1.2.2.2 pillar_1.4.3         glue_1.4.0          
[19] withr_2.2.0          matrixStats_0.56.0   lifecycle_0.2.0      plyr_1.8.6           lava_1.6.7           stringr_1.4.0        timeDate_3043.102    munsell_0.5.0        gtable_0.3.0        
[28] recipes_0.1.12       codetools_0.2-16     inline_0.3.15        knitr_1.28           callr_3.4.3          ps_1.3.2             class_7.3-17         fansi_0.4.1          Rcpp_1.0.4.6        
[37] scales_1.1.0         ipred_0.9-9          StanHeaders_2.19.2   gridExtra_2.3        stringi_1.4.6        processx_3.4.2       dplyr_0.8.5          grid_3.6.3           cli_2.0.2           
[46] tools_3.6.3          magrittr_1.5         tibble_3.0.1         crayon_1.3.4         pkgconfig_2.0.3      ellipsis_0.3.0       MASS_7.3-51.6        Matrix_1.2-18        prettyunits_1.1.1   
[55] data.table_1.12.8    pROC_1.16.2          lubridate_1.7.8      gower_0.2.1          assertthat_0.2.1     rstudioapi_0.11      R6_2.4.1             rpart_4.1-15         nnet_7.3-14         
[64] nlme_3.1-147         compiler_3.6.3      
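The sessionInfo() output above lists OpenBLAS for BLAS and LAPACK. One further check, sketched here under the assumption (not confirmed) that the extra threads might come from a multithreaded linear-algebra library rather than from an R-level backend, is to look at the thread-related environment variables. The variable names below are the commonly used ones, and the optional RhpcBLASctl call only runs if that package happens to be installed.

```r
# Thread-related environment variables; an empty string means "unset"
thread_vars <- Sys.getenv(c("OMP_NUM_THREADS",
                            "OPENBLAS_NUM_THREADS",
                            "MKL_NUM_THREADS",
                            "MC_CORES"))
print(thread_vars)

# Optional: query and limit BLAS threads, only if RhpcBLASctl is available
if (requireNamespace("RhpcBLASctl", quietly = TRUE)) {
  print(RhpcBLASctl::blas_get_num_procs())  # threads BLAS may currently use
  RhpcBLASctl::blas_set_num_threads(1)      # restrict BLAS to one thread
}
```

If core usage drops after limiting BLAS threads, that would point to the linear-algebra library rather than to caret, BiocParallel, or foreach.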

Thank you!

David
