git backed content... what is the manifest actually used for?

I think I'd somewhat misunderstood how git backed deployments work...

I'd thought git deployments would only trigger when the manifest.json files were updated - but it seems git deployments happen whenever anything changes on the branch. I also thought the manifest would be checked for a list of files and their hash codes... but these actually seem to be ignored?

How I found this... If I look inside an app folder like /var/lib/rstudio-connect/apps/7/29/ then it seems like the deployed app contains more files than just the files mentioned in the manifest - e.g. if I add files inside an R folder then those files trigger a deployment and get deployed.

Is there any more info available about how git backed deployments work beyond the docs on
https://docs.rstudio.com/connect/1.7.6/user/git-backed.html ?

I'm especially wondering about whether I can exploit this behaviour to make some deployments simpler for our less technical developers - can I rely on the fact that the hash codes and file lists in manifest.json don't matter for the git backed case? Can I rely (for a major version at least) on all folder files to be deployed alongside each app?

From an admin perspective I guess I'm also a bit surprised that if the manifest and files get out of sync, then I don't even get a warning alert about this - the deployments seem to just succeed regardless?

Hey @slodge !! So sorry for the delay getting back to you here!

You are spot on, the manifest.json is way overkill in the information / detail that it tracks about files and such. Although we are thinking about and planning to add a way to exclude files at some point, you are exactly right - at present, the Connect deployment mechanism is very git-based (new commits trigger new deployments) and very naive (it includes all files in the directory).

There are several reasons for this. At its simplest, many of the error states you can run into by basing things off of just the manifest, or by depending on hash values inside the manifest, turn out to be very painful for users. Our hope with the current implementation is that it strikes the balance between reproducibility and good UX, with the downside that it is possible to deploy old manifests (i.e. if you forget to update it) and lose reproducibility that way.

For that case, we usually recommend building a pre-commit hook or something of that nature (e.g. a more recent discussion with a similar aim here: Github Action for `rsconnect::writeManifest`) to warn, error, or generate the manifest for you.
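As a rough illustration of the "warn / error" flavour of such a hook, here is a minimal Python sketch that compares the `files` section of a manifest.json with what is actually on disk. It assumes the manifest has the shape `{"files": {"<relative path>": {"checksum": "<md5 hex>"}}}` and that manifest.json does not list itself; check both against a manifest generated by your own rsconnect version. The function names `md5_of`, `check_manifest` and `main` are made up for this example and are not part of any RStudio tooling.

```python
import hashlib
import json
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file's contents."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def check_manifest(app_dir: Path) -> dict:
    """Compare manifest.json's "files" section with the files on disk.

    Assumption: entries look like {"<relative path>": {"checksum": "<md5>"}}.
    """
    manifest = json.loads((app_dir / "manifest.json").read_text())
    listed = manifest.get("files", {})

    # Every file under app_dir, skipping the manifest itself and .git internals.
    on_disk = set()
    for p in app_dir.rglob("*"):
        rel = p.relative_to(app_dir)
        if p.is_file() and rel.as_posix() != "manifest.json" and ".git" not in rel.parts:
            on_disk.add(rel.as_posix())

    return {
        # Would be deployed by Connect, but not mentioned in the manifest.
        "unlisted": sorted(on_disk - set(listed)),
        # Mentioned in the manifest, but no longer present.
        "missing": sorted(set(listed) - on_disk),
        # Present in both, but the recorded checksum is stale.
        "changed": sorted(
            name
            for name in on_disk & set(listed)
            if listed[name].get("checksum") != md5_of(app_dir / name)
        ),
    }


def main(argv) -> int:
    """Print one line per discrepancy; return 1 if any were found, else 0."""
    app_dir = Path(argv[1]) if len(argv) > 1 else Path(".")
    report = check_manifest(app_dir)
    for kind, names in report.items():
        for name in names:
            print(f"{kind}: {name}")
    return 1 if any(report.values()) else 0
```

To use this as a hook, you could end the script with `import sys` and `raise SystemExit(main(sys.argv))` so that a non-zero exit blocks the commit. The "generate the manifest for you" flavour would instead call `rsconnect::writeManifest` from R.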

I'm curious to hear how things have progressed since you wrote this message (6 MONTHS AGO!? Where does time go!?) and if this fits your needs or if you have any additional questions / feedback!

We talked with another UK customer about how their CI is set up (they are also an Azure DevOps user) and got some inspiration from them....

After that we set up a build system using yaml AZD pipelines, Rocker-derived containers, S3-backed renv restores, devtools package build, S3 CRAN publishing and packrat/connectApi deployments to our connect servers.

It's quite involved but is working quite well :slight_smile:

The use of renv, internal packages and connect aren't friction-free for some of our users (they're used to a global package cache and to source'ing files across very wide repos) but I'm pretty confident it's a good base for the future.

There are a few other things we'd like to add - especially looking forward to being able to control app access during publishing - but for now it's working. Can send more info via email if it helps.


I'm glad to hear things are working well for you!

I am definitely curious to hear more - let's follow up over email!