I am looking for best practices for working with RStudio Open Source installed on an (Azure) Databricks cluster. Locally, I organize my R code in projects, which is best practice on a local machine. However, is this also the suggested way of working with R code on Databricks, given that I want to use RStudio Open Source as the IDE instead of the Databricks notebook IDE?
Questions / Discussion points:
- Should I organize my R code in projects?
- Where should I save my R code?
- Other tips on how to work with RStudio Open Source on Databricks? For example, what are the pros and cons of installing packages via the Databricks UI versus `install.packages()`?
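To illustrate the last point, this is the kind of in-session installation I mean; a minimal sketch, assuming the packages are pulled from CRAN and that anything installed this way lives only on the cluster's driver node for the lifetime of the cluster:

```r
# Install packages from within an RStudio session on the cluster
# (assumption: CRAN is reachable from the cluster; packages installed
# this way are not persisted across cluster restarts).
install.packages(c("dplyr", "ggplot2"), repos = "https://cran.r-project.org")

# Check which library paths the session is using, to see where the
# packages actually landed on the cluster's filesystem.
.libPaths()
```

By contrast, packages installed via the Databricks UI (cluster libraries) are reinstalled automatically whenever the cluster starts, which is part of the trade-off I am asking about.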
This introductory video on using RStudio on Azure Databricks is somewhat useful, but it does not cover the points listed above.
In general, my impression is that Databricks provides much less practical information and code examples on how to use their platform with R compared to Python and Scala, which I think is a shame for the R community.
Any comments, suggestions, and links to resources are most welcome.