I'm working with RStudio Server hosted on a large Linux box. It seems to me that I should be able to work with a large amount of data: the box has 120GB+ of memory and multiple CPUs.
I'll share how I'm working with my datasets (~10GB) and where I'm having trouble.
I work with data.table and dplyr. I'm doing basic operations, like mutate, group by, and summarize.
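To make that concrete, here is a minimal sketch of the kind of pipeline I mean. The data frame and the column names (`group`, `value`, `value_sq`, `total`) are made up for illustration and are not my real dataset, which is far larger.

```r
library(dplyr)

# Made-up stand-in for the real ~10GB data; all column names are hypothetical
df <- data.frame(
  group = c("a", "a", "b", "b"),
  value = c(1, 2, 3, 4)
)

# The basic pattern: derive a column, group, then aggregate
result <- df %>%
  mutate(value_sq = value^2) %>%
  group_by(group) %>%
  summarize(total = sum(value_sq), .groups = "drop")

print(result)
```

Each step here is simple on its own; the trouble only shows up when the same pattern runs on the full-size data.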
Yet my code is often very slow, and it sometimes takes down the whole 120GB box.
What are other ways to work with big datasets in RStudio Server? Do I need to spin up a DB?