Follow-Up Thread for "Reproducibility in Production" Webinar

REPRODUCIBILITY IN PRODUCTION

A webinar presentation by Thomas Mock and Garrett Grolemund

Live on September 4th, at 11:00 AM - 12:00 PM EDT

Executable documents offer limitless opportunities for your business. With them, your consumers can rerun your report with new parameters, apply your analysis to new data, or schedule future, automatic updates to your work, all with the click of a button. This is the first in a three-part webinar series that will describe this new form of reproducibility. Here, we begin by showing you how to write executable R Markdown documents for a production environment.

Thanks to Garrett for the great presentation. I know now that this series of webinars will not cover the pagedown package. But since others may be in my boat, I thought I'd ask a related question here.

As part of my job, I produce a number of PDF reports to be distributed to a variety of stakeholders. Many readers are somewhat computer illiterate. I think their heads might explode if they are emailed an HTML file instead of a PDF.

Currently, I use a Word template to knit R Markdown documents into report formats, but this limits how much I can customize the appearance of certain content, especially tables of which there are many.

I know that the default PDF method for R Markdown is LaTeX, but customizing LaTeX is complicated and arcane.

So I've set my sights on the pagedown package, particularly the executive summary example highlighted in the pagedown readme. Unfortunately, there is no source available for this example (and there hasn't been for many months). Does anyone else have pagedown examples like this? Any guidance would be greatly appreciated.

I realize this may merit its own post, so I am willing to do that if needed. Just thought the webinar audience might share some similarity to me.

Sorry, work got in the way of me watching this webinar. Has a link been provided to this webinar recording yet? Thank you.

Yes, sir. I've added the link above.

Webinar Follow-Up Questions

Can you please repeat the difference between "r setup" versus "r" for the R code chunks?

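by Mock Thomas - The chunk named setup is special: when you run chunks interactively in the RStudio IDE (notebook mode), it is run automatically once before any other chunk, so it is the place for setup/configuration steps. Any other word after r is just an optional chunk label with no special behavior. For more details see https://r4ds.had.co.nz/r-markdown.html#chunk-name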


How to run an rmarkdown document in command line with parameters?

by Garrett Grolemund - Suppose your document takes a parameter named num and a parameter named int, then you could supply values to the parameters at the command line like this: rmarkdown::render("your-doc.Rmd", params = list(num = 1.5, int = 2))
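To drive the same call from a shell rather than the R console, one option is a small wrapper script run with Rscript. This is only a sketch: the script name render-report.R, the argument order, and the fallback values are assumptions for illustration, and your-doc.Rmd is assumed to declare num and int under params: in its YAML header.

```r
# render-report.R (hypothetical script name)
# Usage from a shell: Rscript render-report.R 1.5 2

# Collect only the arguments that follow the script name.
args <- commandArgs(trailingOnly = TRUE)

# Fall back to illustrative defaults when an argument is not supplied.
num <- if (length(args) >= 1) as.numeric(args[[1]]) else 1.5
int <- if (length(args) >= 2) as.integer(args[[2]]) else 2L

report_params <- list(num = num, int = int)

# Render only if the document is present in the working directory.
if (file.exists("your-doc.Rmd")) {
  rmarkdown::render("your-doc.Rmd", params = report_params)
}
```

For a one-off render, passing the whole call to Rscript -e should work as well.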


Have any surveys been done on what % of data scientists use computational documents vs those that don't?

by Garrett Grolemund - That's a very interesting question. I don't know of any, but I think it would be a bit early. My impression is that computational documents are just beginning to coalesce as a category that people talk about.


How do you share markdown with non-technical people who don't use GitHub (I guess email the html file?)

by Mock Thomas - You can host R Markdown as HTML on RStudio Connect, which gives non-technical users a secure way to access and re-generate reports. You could also email the knit report as HTML to a colleague to open locally.


Where do you store data? (presuming static data, but too large for git)

by Garrett Grolemund - You can store data anywhere that you would normally access it from with R code. Then put that R code in one of your code chunks (probably the setup chunk). This would let you keep your data in a database, a Spark cluster, behind an API, or more. I work with small data sets and prefer to keep the data alongside my .Rmd.


Can I use R connect or shiny to build a dashboard for several clients?

by Mock Thomas - RStudio Connect is a central hosting platform that provides both server and app-level security and sharing. So you could share documents with specific groups or limit access.


Can you pass parameters from a function - e.g., I'm trying to produce a series of reports based on a list of parameters

by Garrett Grolemund - You bet. Suppose your document takes a parameter named num and a parameter named int, then you could supply values to the parameters at the command line like this: rmarkdown::render("your-doc.Rmd", params = list(num = 1.5, int = 2))
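For a series of reports, the same render() call can be wrapped in a function and applied to a list of parameter sets. The sketch below is one way to do that; param_sets, output_name(), and render_all() are hypothetical names, the .html extension assumes the document's default output is HTML, and your-doc.Rmd is assumed to declare num and int as parameters.

```r
# Hypothetical parameter sets; one report is rendered per element.
param_sets <- list(
  list(num = 1.5, int = 2L),
  list(num = 3, int = 4L)
)

# Build a distinct output file name so reports do not overwrite each other.
output_name <- function(p) {
  sprintf("report-num%s-int%s.html", p$num, p$int)
}

# Render `input` once per parameter set and return the output file names.
render_all <- function(input, sets) {
  vapply(sets, function(p) {
    rmarkdown::render(
      input,
      params = p,
      output_file = output_name(p),
      envir = new.env()  # fresh environment for each report
    )
    output_name(p)
  }, character(1))
}

# File names that render_all() would produce for the sets above.
output_files <- vapply(param_sets, output_name, character(1))

# Uncomment once "your-doc.Rmd" exists and declares `num` and `int` as params:
# render_all("your-doc.Rmd", param_sets)
```

Rendering each report in its own environment keeps objects created by one report from leaking into the next.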


If a user has interacted with a document using Shiny, can his version be rendered as a pdf or Word document?

by Garrett Grolemund - Sure, so long as you build the functionality to do that into your app. Here are some examples: https://shiny.rstudio.com/gallery/generating-reports.html, https://shiny.rstudio.com/gallery/download-knitr-reports.html


Can the code in html be hidden but a user still can view it if wanted (like show/hide button)?

by Garrett Grolemund - Yes. This is the default display when you set the output field in the header to output: html_notebook. To see this in action go to File > New File > R Notebook in the RStudio IDE. Then click the Preview button above the document that opens.


Is there a way to have your code chunk in R Markdown publish in your document but a user could collapse or expand the code chunk if they are interested in seeing the code or keep it hidden?

by Garrett Grolemund - Yes. This is the default display when you set the output field in the header to output: html_notebook. To see this in action go to File > New File > R Notebook in the RStudio IDE. Then click the Preview button above the document that opens.


Can you please mention the differences between R Markdown and R Notebook

by Garrett Grolemund - R Notebooks are R Markdown documents with a different set of defaults. The most important default difference is this: R Markdown documents re-run all of your code in a fresh session each time you render the document. This can be expensive if you have big data or long-running code. R Notebooks expect you to run each code chunk as you write it. They cache the results and compile the most recent cached results into a complete document when you render/preview, without re-running everything.


Beside the "format" page that you showed at the beginning of the webinar, can you suggest any other online resource where I can learn about output formats of an R markdown document? I'm specifically interested in scientific journal-related formats.

by Garrett Grolemund - This is a useful book chapter about using R Markdown for scientific journals. It mentions the rticles package, which is also helpful. Also, https://rmarkdown.rstudio.com/formats.html has a section on journal formats.


How do we handle reproducibility when package version changes may cause results to change? Can we force a certain version of each package?

by Garrett Grolemund - RStudio Package Manager is built specifically for package versioning and management, https://www.rstudio.com/products/package-manager/. You can also use the renv package to create a reproducible versioning system for your project, https://rstudio.github.io/renv/articles/renv.html


In RStudio Connect - is there a dropdown of available parameters to select from, or would Tareef need to know that "Aspirin" was an option?

by Mock Thomas - You can define a fixed set of parameter values to choose from, so the end user doesn't have to memorize all the options.


Does multi-language support apply to inline code in addition to code chunks (i.e. python 1 + 1)?

by Garrett Grolemund - At the moment, no.


Is it possible to restate what functionality you can add in terms of interactivity that can be captured without an active R session? E.g. shiny/html widgets require an active session, yes?

by Garrett Grolemund - htmlwidgets do not require an active R session, but Shiny does. "Do you use Shiny?" is analogous to asking "Do you need an active R session?"


Can one of the parameters be the current username? Can RStudio get the current username?

by Garrett Grolemund - Shiny can get the username as session$user (see https://docs.rstudio.com/connect/user/plumber.html#user-meta-data-1) so this might open some options to do what you are suggesting within a Shiny app. Otherwise, I think you'd need to set up some initial code in the R session that generates the Rmd. For RStudio Connect, you don't have access to that R session as far as I know.


What is the best way to start learning R for experienced programmers?

by Garrett Grolemund - If you want to use R to start analyzing data, I recommend https://r4ds.had.co.nz/ or https://rstudio.cloud/learn/primers. If you want to learn to write and debug programs in R, I recommend https://adv-r.hadley.nz/


Is there a simple way to set echo = FALSE for the whole document?

by Mock Thomas - You can use knitr::opts_chunk$set() to set echo = FALSE (or other chunk options) globally. See https://rmarkdown.rstudio.com/lesson-3.html
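As a minimal sketch, the call below would typically go in the setup chunk near the top of the document so that it applies to every chunk that follows. Putting it in the setup chunk is a common convention, not a requirement.

```r
# Hide source code for all chunks by default; results still appear.
knitr::opts_chunk$set(echo = FALSE)

# Confirm the new document-wide default.
knitr::opts_chunk$get("echo")
```

An individual chunk can still show its code by setting echo = TRUE in its own chunk header.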


What developments in R markdown are you most excited about in the future (or that may be in development)?

by Garrett Grolemund - I'm most excited about scheduling reports to update themselves with new data and to then automatically email themselves to stakeholders if anything has changed. This is a feature of RStudio Connect, and it amounts to hands-free Statistical Process Control (and much more)!


Can you give examples of shiny apps created and being used by business

by Garrett Grolemund - https://www.rstudio.com/products/shiny/shiny-user-showcase/ is a collection of apps whose owners have offered to let us share their work. It features apps used in business, but is not limited to business.


Does one usually create the analysis with regular R scripts and then convert to .Rmd?

by Garrett Grolemund - No, there's no reason to begin with an R Script. You can develop the code in an .Rmd. This is particularly useful if you enjoy working with a cell based/notebook-like interface. I reserve R scripts for packages and longer programs where I suspect I'll need to use the IDE's debugging tools.


For report documenting within Rmarkdown, is it possible to customize and add a title page, insert table of contents?

by Mock Thomas - Hi Benjamin - you can add a table of contents that automatically generates based on header level. See https://bookdown.org/yihui/rmarkdown/html-document.html


Could you share today's slides or video? Thanks!

by Gladu Elisa, Mock Thomas - Hey there - Yes, the webinar will be recorded and shared in a few days. All slides will be shared on our Webinars GitHub: https://github.com/rstudio/webinars


How does rmarkdown handle htmlwidgets when rendering into a PDF and HTML? Will the PDF output not render?

by Garrett Grolemund - If you render the file with an htmlwidget to pdf, R Markdown will return an error message that asks you to add always_allow_html: yes to your YAML header. If you add this, R Markdown will render the pdf and try to use phantomJS to embed a static screenshot of the widget in the widget's place.


Does html render for most individuals?

by Mock Thomas - html is the default for R Markdown formats


When you have interactivity, does the data have to be on a server / available for a general user to interact with?

by Garrett Grolemund - Interactive documents do not have to be served on the internet for the general public. You can host them behind a firewall on a server that you control, or you can use shinyapps.io, Shiny Server Pro, or RStudio Connect to share an app with only people authorized to see that specific app.


How would you generate both reports as separate documents, but not edit/run each time?

by Mock Thomas - You can use rmarkdown::render() to generate multiple reports with either different parameters (inputs) or different output formats, like rmarkdown::render("input.Rmd", output_format = c("html_document", "pdf_document"))

