Organizing data analysis packages



What are the best practices for organizing the functions, scripts, and settings files in a data analysis workflow?

My current workflow looks like this:

  • Create a set of documented functions in the R folder
  • Create a "settings.R" file with filters, environment variable locations, dates, database information, etc.
  • Create a "script.R" or "script.Rmd" file that calls these functions and generates the output (typically in a set order)
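
To make that concrete, here is a minimal sketch of how the three pieces fit together. Everything in it is made up for illustration: the `settings` list, the `filter_data()` function, and the toy data frame are hypothetical stand-ins, and the three files are shown in one block only so the example runs on its own.

```r
# --- settings.R (hypothetical contents) ---
# Filters and dates live in one place so the script never hard-codes them.
settings <- list(
  start_date = as.Date("2020-01-01"),
  end_date   = as.Date("2020-12-31"),
  min_value  = 0
)

# --- R/filter_data.R (hypothetical documented function) ---
#' Keep rows inside the configured date range and above the minimum value.
#' @param df A data frame with `date` and `value` columns.
#' @param settings A list with `start_date`, `end_date`, and `min_value`.
filter_data <- function(df, settings) {
  keep <- df$date >= settings$start_date &
    df$date <= settings$end_date &
    df$value >= settings$min_value
  df[keep, , drop = FALSE]
}

# --- script.R ---
# In the real project this would start with source("settings.R");
# here the settings are already defined above.
df <- data.frame(
  date  = as.Date(c("2019-12-31", "2020-06-15", "2020-07-01")),
  value = c(5, -1, 10)
)

result <- filter_data(df, settings)
print(result)  # only the 2020-07-01 row passes both filters
```

Passing `settings` in as an argument, rather than reading a global inside the function, is just one option; it keeps the functions testable without the settings file.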

Should I wrap my script and settings .R files inside functions too, or put them in a different directory? I'm curious how others manage this workflow.


Have you seen this thread: Data Science Project Template for R? It seems to have some relevant thoughts on organising projects. I find that adapting the R package structure works well for my data analysis work, but there are many other approaches that seem to work too.
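
As a rough sketch of what adapting the package structure can look like, the snippet below builds a skeleton in a temporary directory using only base R. The project name `myanalysis`, the `data-raw`, `analysis`, and `output` folders, and the placeholder DESCRIPTION fields are all assumptions chosen for illustration, not a fixed convention.

```r
# Sketch only: every directory and file name below is an assumption.
project <- file.path(tempdir(), "myanalysis")  # hypothetical project name

# R/ holds the documented functions; the other folders are one possible split.
dirs <- c("R", "data-raw", "analysis", "output")
for (d in dirs) {
  dir.create(file.path(project, d), recursive = TRUE, showWarnings = FALSE)
}

# Placeholder DESCRIPTION so the folder resembles a package.
writeLines(
  c("Package: myanalysis",
    "Title: Example Analysis Project",
    "Version: 0.0.1"),
  file.path(project, "DESCRIPTION")
)

# Settings and the driving script sit outside R/, which stays functions-only.
file.create(file.path(project, "analysis", c("settings.R", "script.Rmd")))

# List everything that was created.
created <- list.files(project, recursive = TRUE, include.dirs = TRUE)
print(created)
```

With a layout like this, the functions in R/ can be documented and tested like any package code, while the script and settings stay in their own directory.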