Failure of glm to converge, difference between running in RStudio and R

When running a regression on a relatively large data set in RStudio, I get the following warning and nonsensical results.

Warning message: algorithm did not converge

But when I run the same code in R interactively, the model fits without issue. The sessionInfo() output is identical in the two environments, and I have listed it below.

Sample Code:
Works in R interactively, but gives weird results in RStudio. If I reduce the sample size to 4000, the regression works without issue in either environment.

# set seed and create sample data
true_class <- factor(sample(paste0("Class", 1:2), 
                            size = 4096,
                            prob = c(.2, .8), replace = TRUE))
true_class <- sort(true_class)

class1_probs <- rbeta(sum(true_class == "Class1"), 4, 1)
class2_probs <- rbeta(sum(true_class == "Class2"), 1, 2.5)

test_set <- data.frame(obs = true_class,
                       Class1 = c(class1_probs, class2_probs))

# run regression
analysis <- glm(obs ~ Class1, data = test_set,
                family = "binomial")
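
To compare how the fit behaves in each session, it may help to print the convergence flag and iteration count and to turn on tracing. This is only a sketch: it rebuilds the sample data above with an arbitrary seed (`set.seed(1)` is an assumption, not the seed from my original run) and uses `glm.control(trace = TRUE)` to print the deviance at each iteration.

```r
# Recreate the sample data; the seed value is arbitrary (an assumption).
set.seed(1)
true_class <- factor(sample(paste0("Class", 1:2),
                            size = 4096,
                            prob = c(.2, .8), replace = TRUE))
true_class <- sort(true_class)
class1_probs <- rbeta(sum(true_class == "Class1"), 4, 1)
class2_probs <- rbeta(sum(true_class == "Class2"), 1, 2.5)
test_set <- data.frame(obs = true_class,
                       Class1 = c(class1_probs, class2_probs))

# Fit with the deviance printed at every iteration, so the output from
# the two sessions can be compared step by step.
analysis <- glm(obs ~ Class1, data = test_set, family = "binomial",
                control = glm.control(trace = TRUE))

# Did the fit converge, and after how many iterations?
print(analysis$converged)
print(analysis$iter)
print(coef(analysis))
```

If the traced deviances diverge between the two sessions from the first iteration on, that would point at the numerical libraries rather than at the data.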

I am using the latest version of RStudio Server, version 1.2.5019.

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

Matrix products: default
BLAS/LAPACK: /opt/intel/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64_lin/

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_3.6.1

I would greatly appreciate any help on this issue.


There's an all-in, hard 1GB memory limit in RStudio Server, which probably explains the difference. See sean's post

I don't think that explains the discrepancy. The post you linked concerns a file upload limit on RStudio Cloud. I am not using RStudio Cloud, and my data object is considerably smaller than 1GB, so I don't see how the two can be related. Thanks for your response.

If it's desktop, rather than cloud, you're right, not the problem.

Just as a check, the code works identically under Ubuntu 19.10 and macOS Mojave in both the R GUI and the RStudio IDE. So it's hard to see why it doesn't in your situation.

Is it possible that a different BLAS implementation is being chosen for R in each case? Perhaps one behaves better than the other here?
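
One way to check this from inside each session is to ask R which BLAS and LAPACK libraries it has loaded. The sketch below uses `extSoftVersion()`, `La_library()` and `La_version()` from base R (available since R 3.4.0). It reports much the same information as the `BLAS/LAPACK` line of `sessionInfo()`, so identical output here would not rule out a difference in how the libraries behave at run time.

```r
# Report the BLAS and LAPACK shared libraries this R session is using.
# On some platforms the paths may come back as empty strings.
blas_info <- c(
  BLAS           = unname(extSoftVersion()["BLAS"]),
  LAPACK         = La_library(),
  LAPACK_version = La_version()
)
print(blas_info)
```

Running this once in RStudio and once in plain R and comparing the printed paths should show whether the same implementation is picked up in both.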


So, is it a $PATH problem? If so, does it arise only because the R GUI and RStudio run as different users? If not, why would the paths differ, and what configuration is needed to make RStudio's path match the R GUI's?
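
To test that idea, the environment of the two sessions can be snapshotted and compared. This is a rough sketch: the list of variable names is my guess at what might matter for library loading and threading, and `compare_env()` is a hypothetical helper, not part of any package.

```r
# Snapshot environment variables that commonly affect library loading
# and threading. The choice of names is an assumption for illustration.
vars <- c("PATH", "LD_LIBRARY_PATH", "LD_PRELOAD",
          "MKL_NUM_THREADS", "OMP_NUM_THREADS", "R_HOME")
env_snapshot <- Sys.getenv(vars, unset = NA)
print(env_snapshot)

# Hypothetical helper: return the names of variables whose values differ
# between two snapshots. Two unset (NA) values count as equal.
compare_env <- function(a, b) {
  same <- (is.na(a) & is.na(b)) | (!is.na(a) & !is.na(b) & a == b)
  names(a)[!same]
}
```

One could save the snapshot with `saveRDS()` in each session, load both into one session with `readRDS()`, and call `compare_env()` on the pair to list the variables that differ.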

Thanks for your message. That was our suspicion, and I have since reverted to the default BLAS, which resolved the issue.

However, what still concerns me is that in both RStudio Server and R, on the same workstation, sessionInfo() pointed to the same Intel MKL BLAS/LAPACK libraries, yet the error occurred only when running the code through RStudio Server.

Have you seen issues where RStudio interacts differently than R with BLAS/LAPACK libraries?



Passing strange. Hopefully, they'll be able to reproduce it and run down a fix.

To the best of my knowledge, we have not -- so this is a bit surprising to me.

If you wanted to investigate further, you could try using e.g.

# get open files for rsession
lsof -p `pgrep -x rsession`

# get open files for R
lsof -p `pgrep -x R`

to compare what files (libraries) are open and used by each process, and see what differences there are.
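
The same comparison can also be made from inside each session on Linux by reading `/proc/self/maps`, which lists the files mapped into the current process. This is a sketch under that assumption: `loaded_libs()` is a hypothetical helper, and it takes the path to be the sixth whitespace-separated field, so paths containing spaces would be cut short.

```r
# Extract the distinct shared-library paths from lines in the format of
# /proc/<pid>/maps (Linux only). Anonymous mappings have no sixth field.
loaded_libs <- function(maps_lines) {
  fields <- strsplit(trimws(maps_lines), "[[:space:]]+")
  paths <- vapply(fields,
                  function(f) if (length(f) >= 6) f[6] else NA_character_,
                  character(1))
  sort(unique(paths[!is.na(paths) & grepl("\\.so", paths)]))
}

# In a live session on Linux, show only the entries that look like
# BLAS, LAPACK, MKL or OpenMP libraries.
if (file.exists("/proc/self/maps")) {
  libs <- loaded_libs(readLines("/proc/self/maps"))
  print(grep("blas|lapack|mkl|omp", libs, value = TRUE, ignore.case = TRUE))
}
```

Running it in both RStudio and plain R and comparing the two lists (for example with `setdiff()`) would show any library that is loaded in one process but not the other.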

