At my new job, we’re thinking through the best way to store, share, and structure our collaborative research projects, and we’re hoping for some input on what works for other folks.
The basic idea is that each project has multiple analysts who may contribute and review code. The data for these projects cannot be shared publicly, so we don’t want to track it in git (even with a private GitHub repo). What are the best approaches for giving analysts access to the data in their local dev environments? Most advice I’ve read assumes that the code and data live in the same project folder, but I don’t think that is what we’re aiming for.
One thing we have considered is storing the data in SharePoint/OneDrive and using Microsoft365R to access it. That way, the data isn’t stored in the same project folder as the analysis code, but we can write data-access code that should work for everyone. We could also write cleaned versions of the data or other output back to SharePoint/OneDrive, since we similarly do not want those tracked in git. Does this seem reasonable? Or are we overcomplicating things by having separate storage locations for data and code?
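
For concreteness, here is a rough sketch of the kind of helper we have in mind. It is not a tested setup: the function names, the `PROJECT_DATA_DIR` environment variable, and the site URL and file paths are all placeholders I made up for illustration. It assumes each analyst points `PROJECT_DATA_DIR` at a folder outside the repo, and that the files live in the default document library of a SharePoint site.

```r
# Sketch only: data_dir(), local_data_path(), fetch_data() and PROJECT_DATA_DIR
# are hypothetical names, not part of Microsoft365R.

# Folder outside the git repo where an analyst keeps local copies of the data.
# Each analyst would set PROJECT_DATA_DIR themselves, e.g. in ~/.Renviron.
data_dir <- function() {
  dir <- Sys.getenv("PROJECT_DATA_DIR", unset = "")
  if (!nzchar(dir)) {
    stop("Set PROJECT_DATA_DIR (e.g. in ~/.Renviron) to a folder outside the repo.")
  }
  dir
}

# Local path for a file, given its path relative to the shared data root.
local_data_path <- function(rel_path) {
  file.path(data_dir(), rel_path)
}

# Download a file from SharePoint unless a local copy already exists.
# Returns the local path so it can be passed straight to a reader function.
fetch_data <- function(rel_path, site_url, overwrite = FALSE) {
  dest <- local_data_path(rel_path)
  if (file.exists(dest) && !overwrite) {
    return(dest)
  }
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)

  # Microsoft365R is only needed when a download actually happens.
  site <- Microsoft365R::get_sharepoint_site(site_url = site_url)
  drive <- site$get_drive()  # default document library of the site
  drive$download_file(rel_path, dest = dest, overwrite = overwrite)
  dest
}
```

Analysis scripts would then call something like `read.csv(fetch_data("raw/survey.csv", site_url = "https://example.sharepoint.com/sites/research"))` (again, a made-up URL and path), so no script hard-codes a machine-specific location and nothing under the repo folder ever contains data.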