I have a fairly long, multi-stage piece of R code that takes a lot of resources and time to run.
In the development stage, I broke it down into multiple stages, with a separate R script for each stage. Each script starts by loading an .RData file from the previous stage and ends by saving another .RData file for the next stage.
Basically, I'd first write the import part, save it. Then write the tidying part, save it. Then write the modeling part. Save it. And so on. It saved me a lot of time at the stages closer to the end.
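A minimal sketch of what one such stage script could look like. The file names, object names, and the tidying step are hypothetical, and the checkpoints are written to a temporary directory only so the sketch runs anywhere:

```r
# Hypothetical stage script (think "02_tidy.R"); all names are illustrative.
checkpoint_dir <- tempdir()
import_file <- file.path(checkpoint_dir, "01_import.RData")
tidy_file   <- file.path(checkpoint_dir, "02_tidy.RData")

# Stand-in for the previous stage: normally the import script
# would already have written this checkpoint.
raw_data <- data.frame(id = 1:5, value = c(2.5, NA, 4.1, 3.3, NA))
save(raw_data, file = import_file)
rm(raw_data)

# The stage itself: load the previous checkpoint, do the work, save the next one.
load(import_file)                                  # restores the object `raw_data`
tidy_data <- raw_data[!is.na(raw_data$value), ]    # placeholder for the real tidying
save(tidy_data, file = tidy_file)                  # checkpoint for the modeling stage
```

Note that load() restores objects under the names they were saved with, so each stage has to know which object names the previous stage produced.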
Now that the code is almost done and ready to work, I am wondering:
- if I should stitch it all back into one R script and remove the intermediate "checkpoints" (create one reproducible piece of code) VS keep it separate (since each stage produces some tangible result and potentially useful data, it would be easier for someone to pick up at any chosen stage instead of recreating everything from scratch)
- in general, if this approach of breaking down long code is good practice to begin with, and if there is a better approach.
- if using csv is frowned upon (it was easier to save and later load .RData objects, as they preserve everything).
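To illustrate what "preserve everything" means here, the sketch below round-trips a small data frame through both formats. The data frame and its column names are made up for illustration, and stringsAsFactors = FALSE is set explicitly so the result does not depend on the R version:

```r
# Illustrative data frame; column names and values are made up.
original <- data.frame(
  group = factor(c("a", "b", "a"), levels = c("b", "a")),
  day   = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03"))
)

csv_file   <- tempfile(fileext = ".csv")
rdata_file <- tempfile(fileext = ".RData")

# Round trip through CSV: only the text of each cell is stored.
write.csv(original, csv_file, row.names = FALSE)
from_csv <- read.csv(csv_file, stringsAsFactors = FALSE)

# Round trip through .RData: the R object itself is serialized.
save(original, file = rdata_file)
restored <- new.env()
load(rdata_file, envir = restored)
from_rdata <- restored$original

class(from_csv$group)            # "character": the factor and its level order are gone
class(from_csv$day)              # "character": no longer a Date
identical(from_rdata, original)  # compares the restored object with the original
```

The CSV copy can be repaired by converting the columns again after reading, but that conversion code then has to be kept in sync with the stage that wrote the file.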