Best practices / suggestions to document Rmarkdown parameters

As the title indicates, I am looking for suggestions on how to document R Markdown params. I have several R Markdown documents with multiple (>5) non-trivial parameters, which I want to share with colleagues. For example:

---
title: "My report"
output:
  html_document:
    toc: true
    toc_float: true
    theme: united
    code_folding: hide
date: "`r format(Sys.time(), '%d %B, %Y')`"
author: "`r Sys.info()['user']`"
params:
  res_file: "results.tsv"
  gs_species: "Homo sapiens"
  gs_category: "All"
  gmt_file: NULL
  extra_sets: NULL
  gs_string: NULL
  padjCutoff: 0.1
---

I could add some light description as comments but it will get pretty large pretty fast:

---
title: "My report"
output:
  html_document:
    toc: true
    toc_float: true
    theme: united
    code_folding: hide
date: "`r format(Sys.time(), '%d %B, %Y')`"
author: "`r Sys.info()['user']`"
params:
  res_file: "results.tsv" # [string] Path to results file generated by script `generate_results.R`
  gs_species: "Homo sapiens" # [string] Organism. Use binomial species nomenclature; currently only Homo sapiens is accepted
  gs_category: "All" # [string vector] Which categories, as listed in the site.blah.categories.html
  gmt_file: NULL # [string] Path to file containing things
  extra_sets: NULL # [string] Path to file containing one set per line. Use the set names listed in the site site.blah.categories.html
  gs_string: NULL # [string] Used for partial matching of sets. When you are too lazy to look at the site above.
  padjCutoff: 0.1 # [numerical] Where to draw the line of what matters
---

Alternatively, I could take advantage of Knit with Parameters and use the label of each variable as the description of the parameter (see section 15.3, "Knitting with parameters", of R Markdown: The Definitive Guide).
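
For reference, the label-based variant might look roughly like the fragment below. It is only a sketch: the label texts and the slider bounds are my own assumptions, and only three of the parameters are shown. The `label`, `value`, `input`, `choices`, `min`, `max` and `step` fields are the ones the Knit with Parameters dialog understands.

```yaml
params:
  res_file:
    label: "Path to results file generated by generate_results.R"  # assumed wording
    value: "results.tsv"
    input: text
  gs_species:
    label: "Organism (binomial species nomenclature)"  # assumed wording
    value: "Homo sapiens"
    input: select
    choices: ["Homo sapiens"]
  padjCutoff:
    label: "Adjusted p-value cutoff"  # assumed wording
    value: 0.1
    input: slider
    min: 0      # assumed bounds
    max: 1
    step: 0.01
```

The report still reads the plain values through `params$padjCutoff` and so on; the extra fields only affect the dialog.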

Any thoughts are appreciated.

Cheers.

In R, the fundamental unit of shareable code is the package.
https://r-pkgs.org/intro.html

If you're sharing report templates with colleagues, one option is to make a small package to hold those reports. If the end user only needs to set the parameters to render the report — i.e. if they do not need to edit the report contents — you could even provide a wrapper function around rmarkdown::render(). That would let you document the parameters in the same way as function parameters.

#' Render result report
#' 
#' Render a fancy result report
#' 
#' @inheritParams rmarkdown::render
#' @inheritDotParams rmarkdown::render
#' @param res_file Path to results file generated by script `generate_results.R`
#' @param gs_species Organisms. Use binomial species nomenclature, currently
#'   only Homo sapiens is accepted
#' @param gs_category Which categories, as listed in the
#'   <https://example.com/categories.html>
#' @param gmt_file Path to file containing things
#' @param extra_sets Path to file containing one set per line. Use the set names
#'   listed in the site  <https://example.com/categories.html>
#' @param gs_string Used for partial matching of sets. When you are too lazy to
#'   look the site above.
#' @param padjCutoff Where to draw the line of what matters
#' 
#' @return Writes the report to `output_file`
#' @export
render_result_report <- function(
  output_file,
  res_file,
  gs_species = "Homo sapiens",
  gs_category = c("All", "Option 1", "Option 2"),
  gmt_file = NULL,
  extra_sets = NULL,
  gs_string = NULL,
  padjCutoff = 0.1,
  ...
) {
  stopifnot(identical(gs_species, "Homo sapiens"))
  gs_category <- match.arg(gs_category)
  # other input checking
  params <- list(
    res_file = res_file,
    gs_species = gs_species,
    gs_category = gs_category,
    gmt_file = gmt_file,
    extra_sets = extra_sets,
    gs_string = gs_string,
    padjCutoff = padjCutoff
  )

  report_template <- system.file("rmarkdown", "templates", "report", "skeleton", "skeleton.Rmd", package = "myReportPkg")

  rmarkdown::render(
    input = report_template,
    output_file = output_file,
    params = params,
    ...
  )
}

Also by storing the report in the package, you get rmarkdown templates for free, but you can still point users to ?render_result_report for information about the expected variables.

And this is just one package-ized option; there are a lot of other structures that could be helpful in this scenario.

p.s. Oh and generate_results.R could be a function, too! :wink:

Now that is something I hadn't considered before and it makes sense :smiley: I have to put more thought into it, especially because my colleagues are R novices, but it seems promising.

I simplified things a bit: it's actually generate_results.Rmd, so yet another parametrized report :smiley: But your point holds true, it's just another wrapper function.

Cheers!

Hello adomingues, I was going to recommend Knit with Parameters, but grrrck's solution seems better.
Good luck with your project.

I am implementing this solution, @grrrck, and it does the trick for me, thank you. The only issue I am having is that when I pass output_dir to rmarkdown::render, my results (saved tables) are not written to the specified folder. The HTML report itself is saved there, though. Any idea why this might be happening?

Cheers,
António

Fixed. I think I need to include the options:

    intermediates_dir = output_dir,
    knit_root_dir = output_dir

I am not 100% sure to be honest, but it's working now so :man_shrugging:

Are these results (saved tables) something you are saving yourself in code chunks?

output_dir controls where the output file is saved, along with any resources created by knitr (like figures from plot chunks). However, it does not manage any output you write yourself. For example, if you are using ggsave, you need to write to the correct output folder yourself.

About the options you tweaked: knit_root_dir sets the working directory in which all the code chunks are evaluated, meaning getwd() in a chunk will return this value. This affects both input and output file paths, so it could matter in your case if you are writing from a chunk to the working directory (which by default is the folder containing the Rmd file).
intermediates_dir is the directory where the intermediate files written during the rendering process are saved.

By default, everything is done in the folder containing the Rmd file. Tweaking this with one of these arguments will work, but there could be some edge cases. Sometimes it is better to manage the workflow fully yourself, by writing the files yourself and moving the results where you want them.
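
To make the "write it yourself" part concrete, here is a minimal sketch in base R. The `out_dir` location, the toy table and the file name are assumptions for illustration only; in a parameterised report the folder could come from a hypothetical `params$out_dir` entry, which is not in the front matter shown earlier.

```r
# Hypothetical output folder; in a report this might be params$out_dir.
out_dir <- file.path(tempdir(), "report_output")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Toy table standing in for results computed in a chunk.
top_sets <- data.frame(set = c("SET_A", "SET_B"), padj = c(0.01, 0.04))

# Build the full path explicitly so the file lands in out_dir,
# whatever the working directory of the chunk happens to be.
table_path <- file.path(out_dir, "top_sets.tsv")
write.table(top_sets, table_path, sep = "\t", quote = FALSE, row.names = FALSE)
```

Because the path is built with file.path(), the table ends up in the same place whether or not knit_root_dir is set.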

I can help more if I understand better what is not working.

This is nice. I did not know about this feature.

Yeah, and this is also something you can take advantage of with the package-ized version! Calling rmarkdown::knit_params_ask(report_template) will start the same Shiny app for choosing parameters that you get with Knit with Parameters, but it will return the params list, which you can hand to rmarkdown::render().

Thanks a lot, this helped. I think I was not passing output_dir correctly between my function and rmarkdown::render. Fixed now.
