RStudio runs really slowly

Hi,

I hope someone can help me out! I am struggling with this situation.

Somehow my RStudio runs the dplyr function summarise_at() really slowly. The problem was solved for two weeks after I updated RStudio, but it has just come back again. I have tried reinstalling R and RStudio, and even reinstalling my computer, but nothing really helps. The function still works, but it takes an extremely long time.

Thanks.

I forgot to say: this also happens with the aggregate() function. I tried to use aggregate() to avoid summarise_at(), but it also runs really, really slowly.

Hi Mingmei, welcome!
Do you have the same problem if you run your code in R (not in RStudio)? What are your versions of R, RStudio, dplyr and OS? Any chance that you could make a minimal REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.

If you've never heard of a reprex before, you might want to start by reading this FAQ:
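If it helps, here is a minimal sketch of the workflow (assuming the reprex package is installed; reprex::reprex() renders whatever is on the clipboard):

# 1. Copy a self-contained snippet to the clipboard, including any library() calls.
# 2. Then run:
reprex::reprex()
# 3. The rendered code plus its output is placed back on the clipboard,
#    ready to paste into a reply here.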


Thank you so much for your advice. A brief introduction: I tested it in plain R too, and it also runs really slowly. But last time the problem was solved by updating RStudio, not R, which is why I came here for help. My version of R is 3.5.3, my RStudio is 1.2.1335, and I'm on Windows 10; last week I ran a Novabench score test and it was still fine.
I tried to create a reproducible example with the same dimensions as the data I use. The simple example (mydata and test) below doesn't have the problem, but with the real data (the RX part) the problem still exists. I wish I could upload the data online for you to check, since I'm desperate to fix this problem. LOL. Please let me know if I can, or whether I should just paste a sample of the data on the forum (see the dput() sketch after the code below). Thank you.

mydata <- as.data.frame(replicate(100, rnorm(100000, 0, 1)))

mydata$group = rpois(100000, 5)
mydata$group.2 = rpois(100000, 500)

test <- mydata %>%
  group_by(group, group.2) %>%
  summarise_at(vars(V1:V10), sum) %>%
  ungroup

RX <- RX %>%
  group_by(DUPERSID, EVNTIDX) %>%
  summarise_at(vars(RXXP[i]), sum) %>%
  ungroup
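
If pasting a small sample here is acceptable, maybe something like this would work (just a sketch using my real RX object; dput() prints an R expression that recreates the rows it is given):

# Sketch: dump the first 20 rows of RX as text that can be pasted into a reply
# and rebuilt on someone else's machine with a single assignment.
dput(head(RX, 20))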

A new update: to be 100% sure the two cases match, I converted DUPERSID and EVNTIDX from factor to numeric, and the problem was solved. However, I don't think this is a desirable way to do it. By the way, my DUPERSID has a format like 60001101 and EVNTIDX has a similar but longer format, 600011011361.
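
For reference, this is roughly the conversion I mean (a sketch against my real RX object; note that as.numeric() on a factor returns the level codes, so it has to go through as.character() first):

# Convert the factor ID columns to numeric without losing the actual ID values.
# as.numeric(RX$DUPERSID) alone would return the internal factor level codes.
RX$DUPERSID <- as.numeric(as.character(RX$DUPERSID))
RX$EVNTIDX  <- as.numeric(as.character(RX$EVNTIDX))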

On my desktop, creating mydata does feel a little slow. It has 10 million values in it (100 columns × 100,000 rows).

Have a look at the reprex below. (On making a good reproducible example, note that it's important to include any libraries you call upon. Also note I needed to hide the lines referring to the RX object, since it was never set up before it was called. =)

library(dplyr)
system.time({
  mydata<-as.data.frame(replicate(100,rnorm(100000,0,1)))
})
#>    user  system elapsed 
#>   0.779   0.101   0.886

system.time({
  mydata$group=rpois(100000,5)
  mydata$group.2=rpois(100000,500)
})
#>    user  system elapsed 
#>   0.010   0.001   0.011

system.time({
  test <- mydata %>%
    group_by(group,group.2) %>%
    summarise_at(vars(V1:V10),sum)%>%
    ungroup
})
#>    user  system elapsed 
#>   0.023   0.001   0.024

# RX <- RX %>%
#   group_by(DUPERSID,EVNTIDX) %>%
#   summarise_at(vars(RXXP[i]),sum) %>%
#   ungroup

Created on 2019-04-16 by the reprex package (v0.2.1)

In terms of speeding up R, John Mount explores some options here:


Sorry for the late response, and thank you for the reply. Actually, I figured out the reason. The thing is that my original data (the real data, RX, from the MEPS dataset) has numeric columns with labels, e.g.:

AMOUNT PAID, PRIVATE INSURANCE (IMPUTED)
[1] 0 0 0 0 0 0 0 0 0 0

If I remove the labels and turn everything into plain numeric values, as in the example I created, the computation time goes back to normal.
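
In case anyone hits the same thing, here is a minimal sketch of how the labels could be stripped; it assumes the labels arrived as column attributes from the import (e.g. a haven-style labelled import of the MEPS files), which may differ from your exact setup:

# Option 1: if the data came in via haven, drop value labels and variable labels.
# RX <- haven::zap_labels(haven::zap_label(RX))

# Option 2: base R fallback, drop the "label" attribute from every column.
RX[] <- lapply(RX, function(x) { attr(x, "label") <- NULL; x })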

If your question's been answered (even if by you), would you mind choosing a solution? (See FAQ below for how).

Having questions checked as resolved makes it a bit easier to navigate the site visually and see which threads still need help.

Thanks

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.