Structuring repos/projects with a package, a plumber, some reports and a shiny app or two....

slodge · February 20, 2023, 2:00pm

From R Packages (2e) - 4 Fundamental development workflows we've seen the advice:

A directory that is an RStudio Project will contain an .Rproj file. Typically, if the directory is named “foo”, the Project file is foo.Rproj. And if that directory is also an R package, then the package name is usually also “foo”. The path of least resistance is to make all of these names coincide and to NOT nest your package inside a subdirectory inside the Project. If you settle on a different workflow, just know it may feel like you are fighting with the tool

Within our teams we very often find that we're working on small, very domain-specific sets of functions, along with scripts, notebooks, reports, Shiny apps and plumber APIs which are very closely tied to those functions.

In these situations, we've gotten into the habit of putting:

- all those connected things in a single git repo (which ensures they get versioned together),
- using a /package folder inside that repo for the functions (with DESCRIPTION, /R and /man inside there),
- using top-level /scripts, /reports, /apps and /apis folders for the consumers of the package functions,
- using renv on the entire repo.
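
As a rough illustration of the layout listed above, the base R sketch below builds an empty skeleton in a temporary directory and lists it. The repo name "myrepo" and the placement of renv.lock at the repo root are assumptions made for the example; only the folder names come from the description.

```r
# Illustrative skeleton only: "myrepo" is a made-up name, and the files
# created here are empty placeholders rather than a working package.
repo <- file.path(tempdir(), "myrepo")

# Package nested in /package, consumers in top-level folders.
dirs <- c("package/R", "package/man", "scripts", "reports", "apps", "apis")
for (d in dirs) {
  dir.create(file.path(repo, d), recursive = TRUE, showWarnings = FALSE)
}

# DESCRIPTION lives inside /package; renv.lock is assumed to sit at the repo root.
file.create(file.path(repo, "package", "DESCRIPTION"))
file.create(file.path(repo, "renv.lock"))

# List everything that was created, directories included.
layout <- sort(list.files(repo, recursive = TRUE, include.dirs = TRUE))
print(layout)
```

With this shape, anything run from the repo root has to point at the nested package explicitly, since DESCRIPTION is not in the working directory.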

This seems to be working OK for us - especially as we've got build pipelines which auto-increment the package version number, deploy that to our local CRAN, and then auto-deploy the reports and apps using that latest checked-in package.

The only tool fighting we seem to be doing is that people have to remember to use e.g. devtools::load_all("package") rather than just devtools::load_all().

But I'm wondering:

- What "better" ways of structuring this are there?
- What tools should we be fighting that we clearly are missing?

Gabor · February 21, 2023, 9:35am

Why not have DESCRIPTION, man/ and R/ at the top level? Are there multiple packages in the same repo?

slodge · February 21, 2023, 11:11am

We do have one repo (out of many) with multiple packages, but generally we're keeping it 1:1

I guess we could have the reports, scripts, apps and apis as sub-folders of the package... but they all use the package, rather than being part of it - what's the logic/incentive in putting them inside the package structure/folder?

Gabor · February 21, 2023, 11:22am

That you can call devtools::load_all(), which is a keypress in RStudio, and probably (well, hopefully) nothing else changes in your workflow.

Btw. you can also call load_all() from scripts within the package, and the scripts can be in subdirectories, because load_all() searches upward through the parent directories for the package root.

I have this organization, and it works for me pretty well: containers/website at main · r-hub/containers · GitHub
(Although admittedly, this repo / package is pretty simple at this point, and there are no other subdirectories currently.)

I can declare dependencies in DESCRIPTION and I call load_all() from the quarto files.

slodge · February 21, 2023, 11:46am

But just to check - that's it - the only advantage is using devtools::load_all() rather than devtools::load_all("path_to_package")?

... and if we are using e.g. a reports folder rather than putting everything in the root, then we'll still need to use a path anyway (because knit will knit in the reports folder)?

Gabor · February 21, 2023, 12:03pm

Well, you said that was your only problem, so I tried to solve that.

I don't follow the point about the reports folder, sorry. If you have an .Rmd in the reports/ folder, then you can call devtools::load_all() in the .Rmd without specifying the path. By default, load_all() will look in the parent directories for DESCRIPTION (recursively) if it cannot find it in the current directory. Isn't this what you want?
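
The lookup described above can be illustrated with a small base R sketch. This is not the code devtools uses; find_package_root() is a hypothetical helper written for this example, and the "foo" layout is made up. It only shows the idea of walking up from a subdirectory until a DESCRIPTION file is found.

```r
# Hypothetical helper: walk up from `path` until a directory containing
# DESCRIPTION is found. Illustrative only; not devtools' implementation.
find_package_root <- function(path = ".") {
  current <- normalizePath(path, winslash = "/", mustWork = TRUE)
  repeat {
    if (file.exists(file.path(current, "DESCRIPTION"))) {
      return(current)
    }
    parent <- dirname(current)
    if (identical(parent, current)) {
      # Reached the filesystem root without finding a package.
      stop("No DESCRIPTION found in '", path, "' or any parent directory.")
    }
    current <- parent
  }
}

# Demo layout in a temporary directory: package at the top level,
# reports nested inside it.
root <- file.path(tempdir(), "foo")
dir.create(file.path(root, "reports"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(root, "DESCRIPTION"))

# Starting from reports/, the search ends at the package root.
found <- find_package_root(file.path(root, "reports"))
```

Under this layout a report knitted in reports/ would not need an explicit path, because the search starts in reports/ and stops at the first parent that holds DESCRIPTION.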

slodge · February 21, 2023, 12:25pm

Thanks - didn't know that the path parameter was "Path to a package, or within a package." That's useful to know.

I think the overall thing I'm asking about is whether there's something we're missing... When I said "The only tool fighting we seem to be doing" what I meant was "we're not seeing any significant tool fighting"... so I was guessing (perhaps wrongly) that we might be missing tools we could/should be using?

I guess the original R packages text did only say "it may feel like"... so maybe we should just ignore it and carry on

Gabor · February 21, 2023, 1:40pm

To be clear, my advice coincides with the advice from the book: put your package at the top level.

It is not completely intuitive, but it is better to nest your scripts, Rmd documents, etc. inside the package than to nest the package inside your project.

slodge · February 21, 2023, 2:31pm

Thanks

I think a couple of years ago the direction we took was heavily influenced by the layout at the time in GitHub - sol-eng/bike_predict at revamp

Certainly looking at my past posts on here I've definitely referenced it a lot! (Is it possible to use a single renv across multiple projects?, RStudio Connect and private CRAN timing issue?, ...)

Will keep asking and listening about what's "best" and keep evolving our platform alongside our models, packages and apps.
