RStudio runs really slowly

Tags: dplyr, rstudio
#1

Hi,

For anyone who can help me out! I am struggling with this situation.

Somehow my RStudio runs the dplyr function 'summarise_at' really slowly. The problem was solved for two weeks after I updated RStudio, but then it came back. I tried reinstalling R and RStudio, and even reinstalling my computer, but nothing helped. The function still works, just with an extremely long run time.

Thanks.

#2

Forgot to say: this also happens with the aggregate function. I tried using aggregate to avoid summarise_at, but it also runs really, really slowly.
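
For reference, a sketch of what that aggregate call might look like; the column names DUPERSID, EVNTIDX, and RXXP are borrowed from the dplyr code later in this thread, so treat this as an assumption rather than the exact call used:

# Hypothetical base-R equivalent of the grouped sum described above;
# assumes RX has grouping columns DUPERSID/EVNTIDX and a numeric RXXP.
agg <- aggregate(RXXP ~ DUPERSID + EVNTIDX, data = RX, FUN = sum)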

#3

Hi Mingmei, welcome!
Do you have the same problem if you run your code in R (not in RStudio)? What are your versions of R, RStudio, dplyr, and your OS? Any chance you could make a minimal REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.

If you've never heard of a reprex before, you might want to start by reading this FAQ:

#4

Thank you so much for your advice. A brief introduction: I did test it in R as well, and it runs really slowly there too. But last time the problem was solved by updating RStudio, not R, which is why I came here for help. My version of R is 3.5.3 and my RStudio is 1.2.1335. I use a Windows 10 computer, and when I ran a Novabench score test last week the results were still fine.
I tried to create a reproducible example with the same dimensions as the data I use. The simple example (mydata and test) doesn't have the problem, but with the real data (the RX part) the problem still exists. I wish I could upload the data online for you to check, since I'm desperate to fix this problem. LOL. Please let me know if I can, or whether I should just paste the data on the forum. Thank you.

mydata <- as.data.frame(replicate(100, rnorm(100000, 0, 1)))

mydata$group <- rpois(100000, 5)
mydata$group.2 <- rpois(100000, 500)

test <- mydata %>%
  group_by(group, group.2) %>%
  summarise_at(vars(V1:V10), sum) %>%
  ungroup()

RX <- RX %>%
  group_by(DUPERSID, EVNTIDX) %>%
  summarise_at(vars(RXXP[i]), sum) %>%
  ungroup()

#5

New update: to be 100% sure the two cases match, I converted DUPERSID and EVNTIDX from factor to numeric, and the problem was solved. However, I don't think this is a desirable fix. BTW, my DUPERSID values have a format like 60001101, and EVNTIDX values have a similar but longer format, like 600011011361.
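
A minimal sketch of that conversion, assuming RX is the data frame and the IDs are factors whose levels are the numeric ID strings (going through as.character first converts the IDs themselves rather than the underlying factor codes):

# Sketch of the factor-to-numeric workaround described above; assumes
# the ID columns are factors with purely numeric level strings.
RX$DUPERSID <- as.numeric(as.character(RX$DUPERSID))
RX$EVNTIDX <- as.numeric(as.character(RX$EVNTIDX))

Even a 12-digit ID like 600011011361 is well within double precision, so the conversion itself loses nothing.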

#6

On my desktop, creating mydata does feel a little slow; it has 10 million values in it.

Have a look at the reprex below. (On making a good reproducible example, note that it's important to include any libraries you call upon. Also note that I needed to comment out the lines referring to the RX object, since it was never set up before it was called. =)

library(dplyr)

system.time({
  mydata <- as.data.frame(replicate(100, rnorm(100000, 0, 1)))
})
#>    user  system elapsed 
#>   0.779   0.101   0.886

system.time({
  mydata$group <- rpois(100000, 5)
  mydata$group.2 <- rpois(100000, 500)
})
#>    user  system elapsed 
#>   0.010   0.001   0.011

system.time({
  test <- mydata %>%
    group_by(group, group.2) %>%
    summarise_at(vars(V1:V10), sum) %>%
    ungroup()
})
#>    user  system elapsed 
#>   0.023   0.001   0.024

# RX <- RX %>%
#   group_by(DUPERSID, EVNTIDX) %>%
#   summarise_at(vars(RXXP[i]), sum) %>%
#   ungroup()

Created on 2019-04-16 by the reprex package (v0.2.1)

In terms of speeding up R, John Mount explores some options here:

#7

Sorry for the late response, and thank you for the reply. Actually, I figured out the reason: my original data (the real data, RX, from the MEPS dataset) has numeric columns with labels, e.g.:
AMOUNT PAID, PRIVATE INSURANCE (IMPUTED)
[1] 0 0 0 0 0 0 0 0 0 0

If I remove the labels and turn everything into plain numeric values, as in the example I created, the computational time goes back to normal.
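
A hedged sketch of stripping such labels, assuming they are stored as column attributes (as they typically are when MEPS files are imported with packages such as haven or foreign); the exact import path is an assumption, not something stated in the thread:

# Assumed workaround: drop the "label" attribute from every column so
# grouping and summarising see plain numeric vectors.
RX[] <- lapply(RX, function(col) {
  attr(col, "label") <- NULL
  col
})

# Alternative, if the data were read with haven: drop value labels.
# RX <- haven::zap_labels(RX)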

#8

If your question's been answered (even if by you), would you mind choosing a solution? (See the FAQ below for how.)

Having questions checked as resolved makes it a bit easier to navigate the site visually and see which threads still need help.

Thanks

closed #9

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.
