R Markdown vs R Scripts in Enterprise Workflows [Excel -> R Meetup Q&A]

Here are a few related questions from the meetup, Making the Shift from Excel to R: Perspectives from the back-office.

It'd be great to hear your thoughts in the comments below as well!

Are you suggesting that all our code be written in R Markdown and when do you use R scripts?

Mandip: I use regular R scripts as part of a larger process, for example when I’m bringing in a lot of data for a trading application, but I’ll typically start off in an R Markdown document. Sometimes the team will come back and ask for the same thing as a script, for one reason or another. I can’t say whether that’s good practice.

Why would you want to move an R Markdown to a script?

Mandip: As a business analyst, part of my job is to write up requirements for a developer who will put something into production. I start by writing in R Markdown, explaining the logic and where I’m coming from, but at some point my code might not be efficient enough to go into production as written. Handing it over as a script was the team’s particular preference.

Are you suggesting that all our code be written in R Markdown and when do you use R scripts?

Tony: Perhaps it would be helpful to put that into a package. When I’m developing, everything gets made into a function, and the functions get made into a package. My script, even if I have to share it, is a single line that runs one function, which is itself composed of other functions.
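
A minimal sketch of the pattern Tony describes: each step is a small function, and one top-level function composes them. The function names and the sample data are hypothetical, made up for illustration. In a real project the functions would live in a package's R/ directory rather than in one file.

```r
# Hypothetical helpers; in a package these would live under R/.

# Drop rows with a missing amount.
clean_trades <- function(trades) {
  trades[!is.na(trades$amount), , drop = FALSE]
}

# Total amount per desk.
summarise_trades <- function(trades) {
  aggregate(amount ~ desk, data = trades, FUN = sum)
}

# Top-level function composed of the helpers above.
run_report <- function(trades) {
  summarise_trades(clean_trades(trades))
}

# Illustrative data only.
trades <- data.frame(
  desk   = c("rates", "rates", "fx", "fx"),
  amount = c(100, NA, 50, 25)
)

# The shared "script" is then a single line:
report <- run_report(trades)
print(report)
```

Because the shared script only calls `run_report()`, the helpers can be rewritten or optimized later without changing what the recipient runs.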

Mandip: That’s a very valid point. At our organization we’re not at that point yet, but I might try to implement that.

Why would you want to move an R Markdown to a script?

Vasant: I have a point to add to that. In some cases, you might want to run this at the command line or give it to a vendor as an executable. We do that in the biology lab: I make a command-line script so people can just import whatever data they have. They don’t necessarily want to install a package or deal with any of that, and it’s an easy way to push things into production as well. If you want a bunch of scripts in your pipeline, I do that a lot, using littler (an R package) so that an R script can be used as an executable. If it’s going to get more complicated, definitely use a package, because that way you can document it, have help pages, and have documented functions, but that’s one example where you might not want a package. As Mandip said, I’ll start in R Markdown, then knit the script from that, make it an executable, and use that.
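
As a rough illustration of the command-line approach Vasant mentions, here is a sketch of an executable R script. It uses `Rscript`, which ships with R, rather than littler, and the script name, CSV file, and column name are all hypothetical.

```r
#!/usr/bin/env Rscript
# Hypothetical command-line script: summarise one numeric column of a CSV.
# Assumed usage: ./summarise.R input.csv value

# Validate the arguments that follow the script name.
parse_args <- function(args) {
  if (length(args) != 2) {
    stop("usage: summarise.R <csv-file> <column>", call. = FALSE)
  }
  list(path = args[1], column = args[2])
}

# Read the file and summarise the requested column.
summarise_column <- function(path, column) {
  data <- read.csv(path)
  if (!column %in% names(data)) {
    stop("column not found: ", column, call. = FALSE)
  }
  summary(data[[column]])
}

main <- function(args = commandArgs(trailingOnly = TRUE)) {
  opts <- parse_args(args)
  print(summarise_column(opts$path, opts$column))
}

# Run only when arguments were supplied on the command line; with none,
# the file just defines its functions, so it is also safe to source().
if (!interactive() && length(commandArgs(trailingOnly = TRUE)) > 0) {
  main()
}
```

After `chmod +x summarise.R`, the person running it only needs to pass a file and a column name. Keeping the work in functions also makes it straightforward to move them into a package later if the script grows.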

David: Staying with packages and the R ecosystem, I had an interesting experience with the first R script I wrote. The person I asked to review my code told me to put it in a package and add some comments. He said he comes across so much R code without documentation that he doesn't have time to decipher it anymore. I took his advice, and my experience was that if you add a few comments and move the code into a package, you end up with much better code and much better management of the global namespace. Everyone can do whatever they want, because the coding conventions are now internal to the package, and that makes it a lot easier for people to mix and match packages for what they would like to do. As for your infrastructure, R is extremely flexible: whether you’re using it to feed data to a variety of tools or to import data from them, it just works. Over the past few years the tidyverse has matured, and it's much easier to do.

The above is what I said during the meetup. Here are some of the advantages of a package over a script or R Markdown document:

  • You can manage which versions of packages to use on a per-project basis with renv.
  • You can write tests to verify correct operation with testthat and testthis.
  • All the services that use rsconnect, like Shiny and RPubs, will automatically install your package if needed.
  • You can partition your code into public and private components.
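
To make the testing point concrete, here is a small sketch using testthat. The function `pct_change()` is hypothetical and stands in for any function in your package. In a package, the tests would normally sit in a file under tests/testthat/ rather than next to the function.

```r
library(testthat)

# Hypothetical package function under test.
pct_change <- function(old, new) {
  if (any(old == 0)) stop("old must be non-zero")
  (new - old) / old * 100
}

test_that("pct_change computes percentage change", {
  expect_equal(pct_change(100, 110), 10)
  expect_equal(pct_change(c(50, 200), c(25, 300)), c(-50, 50))
})

test_that("pct_change rejects a zero baseline", {
  expect_error(pct_change(0, 5), "non-zero")
})
```

Once tests like these exist, they can be rerun after every change, which is much harder to arrange for code that lives only in a script or an R Markdown chunk.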

One of my Shiny apps has a part that is based on published work and released under the MIT license. It also has a private part because there may be potential commercial interest.

I think that quick, targeted iterations over code are more important than its initial form. In other words, if you feel comfortable starting with an R script, do it. As you gain experience, you'll develop a better sense of how to split the project between R scripts and packages.
