The same script takes much longer after upgrading R to 4.0

Hello.

I recently upgraded R from 3.6 to 4.0, and also upgraded RStudio.

I have a script with two nested for loops, which run 30 and 60,000 iterations respectively. Before the upgrade it took about 10 minutes, but now it takes more than an hour.

I'm running exactly the same script with exactly the same input. Unfortunately, removing R 4.0 did not solve the problem; it still takes much longer than 10 minutes.

All the functions used are base R functions.

Is anybody else experiencing a similar issue? Is there any way to solve this problem?

Thank you.

If removing R didn't change the runtime, my intuition says the cause is likely to be something unrelated to R 4.0.0. Have you checked the CPU, memory, and disk usage? Are you sure the size of the loops hasn't changed?

By the way, what are you doing in base R that is taking so long? Can you refactor the loops into vectors?
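For what it's worth, here is a minimal sketch of the kind of checks I mean. It assumes your data is a numeric matrix called `final`; the random stand-in below is only there so the snippet runs on its own, and the stored-type check is a guess at a common cause of slow element-by-element indexing, not a diagnosis of your script.

```r
# Stand-in for the real data (an assumption); replace with your own object.
final <- matrix(rnorm(1000 * 98), nrow = 1000, ncol = 98)

# Which R is actually running? Worth confirming after installing or
# removing versions.
r_version <- R.version.string
print(r_version)

# Has the size of the input (and hence of the loops) changed?
input_dim <- dim(final)
print(input_dim)
print(format(object.size(final), units = "MB"))

# Is the data stored the way you expect? Indexing a data.frame one
# element at a time is much slower than indexing a numeric matrix.
print(class(final))
print(typeof(final))
```

If `class(final)` reports a data.frame rather than a matrix, converting it once with `as.matrix()` before the loops would be the first thing I'd try.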

Thank you for the reply. The code is as follows.
"final" is a matrix with about 60,000 rows and 98 columns.

aa = matrix(NA, nrow = nrow(final), ncol=34)

st = Sys.time()
for (j in 1:32){
  for (i in 1:nrow(final)){
    if (final[i,3*j] > 1.5 & final[i,3*j+1] < 0.05 & final[i,3*j+2] < 0.05){
      aa[i,j+2] = 1
    } else if (final[i,3*j] < -1 & final[i,3*j+1] < 0.05 & final[i,3*j+2] < 0.05){
      aa[i,j+2] = 2
    }
  }
}
ed = Sys.time()
ed-st

As you can see, there is nothing but base R functions inside the for loops. However, I would be grateful if there is a way to avoid the for loops.

I'm not sure whether there was any difference in CPU, memory, or disk usage before and after I updated R, but thanks for the comment.

Thank you very much.

I don't feel like I have a superpowered laptop, but the code you sent ran in 2.166 seconds for me, with randomly generated values rather than your real ones, of course. I'm not sure what is taking ten minutes to an hour with this code. Are you sure there are only ~60,000 rows here, or is that just the number of rows you are expecting?

In any case, I've replaced it with a somewhat vectorized version here. It's not the kind of code I would normally write, because it seems a little odd to be working with triples of values in the matrix (columns 3, 6, 9, etc. apparently mean something different from columns 4, 7, 10, etc.), but that's perhaps another question. It's about 13.5x faster, reducing 2.16 seconds to 0.16 seconds.

By the way, you can mark off code sections with ``` above the first and below the last line.

```r
# random stand-in data with the same shape as your matrix
n_rows <- 60000
n_cols <- 98
final <- matrix(rnorm(n_rows * n_cols), nrow = n_rows, ncol = n_cols)

# code you supplied
aa = matrix(NA, nrow = nrow(final), ncol=34)

st = Sys.time()
for (j in 1:32){
  for (i in 1:nrow(final)){
    if (final[i,3*j] > 1.5 & final[i,3*j+1] < 0.05 & final[i,3*j+2] < 0.05){
      aa[i,j+2] = 1
    } else if (final[i,3*j] < -1 & final[i,3*j+1] < 0.05 & final[i,3*j+2] < 0.05){
      aa[i,j+2] = 2
    }
  }
}
ed = Sys.time()
ed-st
#> Time difference of 2.16 secs

# vectorized version: compare whole blocks of columns at once
vectorized <- matrix(NA, nrow = nrow(final), ncol = 32)

st <- Sys.time()
places_with_1 <- final[, seq(3, 96, by = 3)] > 1.5 & final[, seq(4, 97, by = 3)] < 0.05 & final[, seq(5, 98, by = 3)] < 0.05
places_with_2 <- final[, seq(3, 96, by = 3)] < -1 & final[, seq(4, 97, by = 3)] < 0.05 & final[, seq(5, 98, by = 3)] < 0.05

vectorized[places_with_1] <- 1
vectorized[places_with_2] <- 2

# attach the two all-NA columns at the front, matching columns 1 and 2 of aa
vectorized <- cbind(matrix(NA, nrow = nrow(final), ncol = 2), vectorized)

ed <- Sys.time()
ed - st
#> Time difference of 0.16 secs

# verify the results are the same:

identical(vectorized, aa)
#> TRUE
```
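If you want to reuse the idea, here is one way it could be wrapped in a function and timed with `system.time()`, which reports CPU time and elapsed time separately (a large gap between the two would suggest the delay is coming from outside the computation itself). This is only a sketch: `classify_triples` and its `n_groups` argument are names I made up, and the data is a small random stand-in.

```r
# Hypothetical helper: for each column triple (3j, 3j+1, 3j+2), return 1
# if the first value is > 1.5, 2 if it is < -1, provided the other two
# values are both < 0.05. All other cells stay NA.
classify_triples <- function(final, n_groups = 32) {
  value_cols <- 3 * seq_len(n_groups)               # columns 3, 6, ..., 96
  value <- final[, value_cols, drop = FALSE]
  small <- final[, value_cols + 1, drop = FALSE] < 0.05 &
    final[, value_cols + 2, drop = FALSE] < 0.05

  result <- matrix(NA_real_, nrow = nrow(final), ncol = n_groups)
  result[value > 1.5 & small] <- 1
  result[value < -1 & small] <- 2

  # Two leading all-NA columns, to match the layout of the loop output.
  cbind(matrix(NA_real_, nrow = nrow(final), ncol = 2), result)
}

# Small random stand-in for the real data (an assumption).
set.seed(1)
final <- matrix(rnorm(2000 * 98), nrow = 2000, ncol = 98)

# Loop version, kept as the reference implementation.
loop_time <- system.time({
  aa <- matrix(NA_real_, nrow = nrow(final), ncol = 34)
  for (j in 1:32) {
    for (i in seq_len(nrow(final))) {
      ok <- final[i, 3 * j + 1] < 0.05 && final[i, 3 * j + 2] < 0.05
      if (final[i, 3 * j] > 1.5 && ok) {
        aa[i, j + 2] <- 1
      } else if (final[i, 3 * j] < -1 && ok) {
        aa[i, j + 2] <- 2
      }
    }
  }
})

vec_time <- system.time(vectorized <- classify_triples(final))

print(loop_time)
print(vec_time)
same <- identical(aa, vectorized)
print(same)
```

Note that if your real data contains `NA` values, the `if` in the loop version will stop with an error, while the vectorized version will simply leave those cells as `NA`.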

I'm happy to answer any follow-up questions you might have!

Thank you for your intuition. It looks like your idea is working.

In fact, I'm quite new to this RStudio forum, so I didn't know how to make code sections here.

Thanks a lot.

Hi,

I have a question about the slowdown, since I've been experiencing something similar since the upgrade as well.

When the code is running slow, do you always see the red stop-sign in the top-right corner of the console? I noticed that for me sometimes code should be running (as RStudio is busy) but the red stop-sign is not showing. Then after a while it suddenly runs the code.

I noticed that this slowness is not happening if I run my script in just base-R (not using RStudio). If you experience similar issues, it might be a bug we need to report back.

PJ

Hi, in my case I think I always saw the red stop-sign when I ran the above code. However, I have had cases where the red stop-sign does not appear but RStudio cannot run any more code; when this happens, the ">" prompt is missing from the R console in RStudio. If I wait for a while, ">" appears again and RStudio works fine.

This actually happened before upgrading, from time to time. I'm not sure if it will or won't happen after the upgrade.

Hope this helps.

Hi,

And have you tried running your script in the base-R console (so not RStudio) to see if there's a difference?
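One way to make that comparison concrete is to save a small benchmark to a file and run the identical file in both places. The sketch below is a scaled-down stand-in for the loops in this thread, not the real script; the file name `bench.R` is just an example. It prints the R version and the front end reported by `.Platform$GUI`, so the two runs can be told apart.

```r
# Save as bench.R (hypothetical name), then run it two ways:
#   in RStudio:       source("bench.R")
#   from a terminal:  Rscript bench.R

timing <- system.time({
  # Random stand-in data, smaller than the real matrix.
  x <- matrix(rnorm(5000 * 98), nrow = 5000, ncol = 98)
  hits <- 0
  # Element-by-element nested loop, similar in shape to the original.
  for (j in 1:32) {
    for (i in seq_len(nrow(x))) {
      if (x[i, 3 * j] > 1.5) {
        hits <- hits + 1
      }
    }
  }
})

cat("R version:", R.version.string, "\n")
cat("Front end:", .Platform$GUI, "\n")
print(timing)
```

If the elapsed time is similar in both runs, the slowdown is probably not coming from RStudio itself.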

PJ
