R and RStudio are slow on macOS

I was waiting for R 4.1 and native Apple silicon support to do some benchmarks against other platforms. The results on my MacBook Pro with the M1 chip look disturbing to me. Let's start with the Mac:

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Big Sur 11.4

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.1.0 tools_4.1.0   

The results from the benchmark are:

> N <- 20000
> M <- 2000
> X <- matrix(rnorm(N*M),N)
> system.time(crossprod(X))
   user  system elapsed 
 49.954   0.109  50.056 
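A single system.time() call can be noisy, so a repeatable variant of the benchmark may be easier to compare across machines. The following is only a minimal sketch: time_crossprod() is a hypothetical helper name, and the smaller dimensions are an assumption chosen so that even the slow case finishes quickly.

```r
# Time crossprod() on a random n-by-m matrix several times and
# return the elapsed seconds of each run.
# time_crossprod() is a hypothetical helper, not part of base R.
time_crossprod <- function(n, m, reps = 3, seed = 1) {
  set.seed(seed)                         # same matrix on every run
  X <- matrix(rnorm(n * m), nrow = n)
  vapply(seq_len(reps), function(i) {
    system.time(crossprod(X))[["elapsed"]]
  }, numeric(1))
}

# Scaled-down dimensions (assumed) so the run stays short.
elapsed <- time_crossprod(n = 2000, m = 200)
print(elapsed)
print(median(elapsed))
```

Calling time_crossprod(20000, 2000) would reproduce the matrix size used above.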

Interestingly, sessionInfo() prints different output in the R Console; it now also lists a BLAS library:

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Big Sur 11.4

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.1.0

It seems to me that R uses the Apple Accelerate framework's BLAS libraries, but the benchmark result is similar:

> system.time(crossprod(X))
   user  system elapsed 
 49.909   0.117  50.015 
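Rather than inferring the BLAS from the sessionInfo() listing, the loaded library can be queried directly. This is a rough sketch using the base R functions extSoftVersion() and La_library(); the test on the file name "libRblas" is my own heuristic for spotting R's bundled reference BLAS and may not hold on every installation.

```r
# Paths of the BLAS and LAPACK shared libraries loaded by this session.
# extSoftVersion() and La_library() are base R functions.
blas_path   <- extSoftVersion()[["BLAS"]]
lapack_path <- La_library()

cat("BLAS:  ", blas_path, "\n")
cat("LAPACK:", lapack_path, "\n")

# Heuristic (an assumption): R's bundled reference BLAS lives in a file
# whose name contains "libRblas"; anything else is likely an external,
# optimized BLAS. An empty path means R could not determine the library.
if (!nzchar(blas_path)) {
  message("BLAS path could not be determined on this platform")
} else if (grepl("libRblas", basename(blas_path), fixed = TRUE)) {
  message("Looks like R's own reference BLAS")
} else {
  message("Looks like an external BLAS")
}
```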

Under Windows, on my ThinkPad E580, it is a whole different story:

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] microbenchmark_1.4-7 RevoUtils_11.0.2     RevoUtilsMath_11.0.0

loaded via a namespace (and not attached):
[1] compiler_4.0.2  tools_4.0.2     grid_4.0.2      lattice_0.20-41

The computations are much quicker:

> system.time(crossprod(X))
   user  system elapsed 
   2.60    0.03    0.70 

The Windows machine runs Microsoft R Open (which ships with Intel MKL), and that may explain the difference. On Ubuntu or Fedora, using OpenBLAS on the ThinkPad, the results are similar to Windows. I don't know if this is to be expected. I'm worried that R on macOS is inexplicably slow.

I was wrong to assume that R was using the faster vecLib libraries in my case. From the R for Mac OS FAQ:

Currently the default is to use the R BLAS: this is recommended for
precision

This is the case if one installs the regular R from CRAN. The solution is to switch to Apple's Accelerate BLAS:

cd /Library/Frameworks/R.framework/Resources/lib
ln -sf /System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Versions/Current/libBLAS.dylib libRblas.dylib

as explained in many sources, for example here.
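Since the FAQ recommends the R BLAS for precision, it may be worth a quick sanity check after switching. The sketch below compares crossprod() against a reference computed with plain sum() calls, which do not go through the BLAS; the matrix size and the use of all.equal()'s default tolerance are assumptions, and passing this check does not prove the two libraries agree in every case.

```r
# Small test matrix (size chosen arbitrarily for a fast check).
set.seed(1)
X <- matrix(rnorm(200 * 20), nrow = 200)

# Result from the BLAS-backed routine.
blas_result <- crossprod(X)

# Reference: each entry is the sum of element-wise products of two
# columns, computed with sum() rather than a BLAS call.
p <- ncol(X)
reference <- matrix(0, nrow = p, ncol = p)
for (i in seq_len(p)) {
  for (j in seq_len(p)) {
    reference[i, j] <- sum(X[, i] * X[, j])
  }
}

# Largest absolute discrepancy between the two results.
max_abs_diff <- max(abs(blas_result - reference))
print(max_abs_diff)

# all.equal() uses a relative tolerance of about 1.5e-8 by default.
agrees <- isTRUE(all.equal(blas_result, reference))
print(agrees)
```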

With the BLAS changed, the MacBook is faster than the ThinkPad (0.553 s vs. 0.70 s elapsed):

> system.time(crossprod(X))
   user  system elapsed 
  0.491   0.059   0.553
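To put the timings in perspective, they can be converted into an approximate throughput. This is a back-of-the-envelope sketch: it counts 2 * N * M^2 floating-point operations as for a general matrix product, which is an upper bound, since a symmetric routine needs only about half of that. The labels in the vector are mine.

```r
N <- 20000
M <- 2000

# Operation count for a general M-by-N times N-by-M product (upper bound).
flops <- 2 * N * M^2

# Elapsed seconds reported in this post.
elapsed <- c(
  reference_blas_m1 = 50.056,  # R's own BLAS on the MacBook
  mro_thinkpad      = 0.70,    # Microsoft R Open on the ThinkPad
  accelerate_m1     = 0.553    # Accelerate BLAS on the MacBook
)

# Approximate throughput in GFLOP/s for each configuration.
gflops <- flops / elapsed / 1e9
print(round(gflops, 1))

# Speedup from switching the BLAS on the same machine.
speedup <- elapsed[["reference_blas_m1"]] / elapsed[["accelerate_m1"]]
print(round(speedup, 1))
```

By this rough count the same M1 goes from a few GFLOP/s to a few hundred, a speedup of about 90 times, purely from swapping the BLAS.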

I guess I fooled myself because on Fedora OpenBLAS is enabled by default and I was expecting the same on macOS. Also, one must read the documentation.