pre_knit usage in rmarkdown YAML

I would like to transform an original Rmd file into another Rmd file before it is passed to knitr, so I'm attempting to use the pre_knit argument of rmarkdown::output_format(). Consider the following .Rmd input:

---
output: 
  pdf_document
---

```{r preformat, echo=FALSE}
cap = function(x){if(knitr::is_latex_output()) sprintf("\\textsc{%s}",x) else x}
```

## R Markdown

This is an R Markdown ~cap(document with formatting)~. 

I can process it with


pre_knit <- function(input, ...){

  rd <- readLines(input)
  id <- grep('~', rd)
  
  rd[id] <- stringr::str_replace_all(rd[id],
                  "~([:alnum:]+)\\((.*?)\\)~",
                  replacement = '`r \\1("\\2")`')
  
  writeLines(text = rd, con = paste0(input,"_tmp.Rmd"))
}


custom_format <- rmarkdown::output_format(knitr = rmarkdown::knitr_options(opts_chunk = list(dev = 'png')),
                                          pandoc = rmarkdown::pandoc_options(to = "html"),
                                          pre_knit = pre_knit)

rmarkdown::render('pre.Rmd', output_format = custom_format)
rmarkdown::render('pre.Rmd_tmp.Rmd', output_format = pdf_document())

but ideally I would like to specify all this in the Rmd YAML header. I can't seem to make this work though; what would be the right place for this custom pre_knit hook in YAML? (and how to avoid an infinite loop by removing it for the next run?) Something like this,

---
output: 
  pdf_document:
    pre_knit: mypkg::custom_format
---

```{r preformat, echo=FALSE}
cap = function(x){if(knitr::is_latex_output()) sprintf("\\textsc{%s}",x) else x}
```

## R Markdown

This is an R Markdown ~cap(document with formatting)~. 

Side-note: currently the two ways I can think of to achieve this with arbitrarily-defined formatting tags would be:

  • pandoc filters
  • a valid R syntax, as in `r cap("document with formatting")`, which is quite burdensome.
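
The rewrite from the tag syntax to that inline R syntax is a single regex substitution. As a minimal sketch in base R (the helper name `expand_tags` is made up for illustration, and `[[:alnum:]]` stands in for stringr's `[:alnum:]` because base R needs the double brackets):

```r
# Hypothetical helper: turn ~fun(text)~ into the inline R call `r fun("text")`.
expand_tags <- function(lines) {
  gsub("~([[:alnum:]]+)\\((.*?)\\)~",   # ~name(anything, matched lazily)~
       "`r \\1(\"\\2\")`",              # `r name("anything")`
       lines, perl = TRUE)
}

expand_tags("This is an R Markdown ~cap(document with formatting)~.")
# [1] "This is an R Markdown `r cap(\"document with formatting\")`."
```

Tagged text that itself contains a double quote or a backslash would produce invalid R with this pattern, so a real implementation would need to escape those first.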

Hmm, I found some examples of the undocumented knit: hook, which could be used for this,

---
knit: (function(inputFile, encoding) {source('_pre_knit.R'); render(inputFile, output_format = custom_format, run_pandoc = FALSE); original = paste0(inputFile, "original.Rmd"); rmarkdown::render(inputFile, encoding = encoding); file.copy(original, inputFile, overwrite = TRUE); unlink(original)})
output: 
  pdf_document
---

```{r preformat, echo=FALSE}
cap = function(x){if(knitr::is_latex_output()) 
  sprintf("\\textsc{%s}",x) else x}
```

This is an R Markdown ~cap(document with formatting)~. 

```{r reformat, echo=FALSE}
cap = function(x){if(knitr::is_latex_output()) 
  sprintf("\\textbf{%s}",x) else x}
```

## Section 2

This is new section with ~cap(bizarre formatting)~. 

with _pre_knit.R containing the custom format definitions (these would eventually move to a package):

library(rmarkdown)
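pre_knit <- function(input, ...){
  rd <- readLines(input)
  id <- grep('~', rd)  # only lines that contain a tag delimiter
  
  # rewrite ~fun(text)~ as the inline R call `r fun("text")`
  rd[id] <- stringr::str_replace_all(rd[id],
                                     "~([:alnum:]+)\\((.*?)\\)~",
                                     replacement = '`r \\1("\\2")`')
  # keep a backup of the untouched source, then overwrite the input in place
  file.copy(input, paste0(input, 'original.Rmd'))
  writeLines(text = rd, con = input)
}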


custom_format <- rmarkdown::output_format(knitr = rmarkdown::knitr_options(opts_chunk = list(dev = 'png')),
                               pandoc = rmarkdown::pandoc_options(to = "markdown"),
                               pre_knit = pre_knit)

pdf_format <- rmarkdown::output_format(knitr = rmarkdown::knitr_options(opts_chunk = list(dev = 'png')),
                                          pandoc = rmarkdown::pandoc_options(to = "latex"))

It kind of works, as a workaround; it might be tidier to modify the metadata between the two runs rather than create a temporary file. I'm still curious if there's a way to directly call pre_knit somewhere in the YAML.
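
A variant of this workaround that never overwrites the source could expand the tags into a sibling temporary file and render that file once. The sketch below is untested against a real document; `expand_tags` and `knit_tagged` are hypothetical names, and it assumes that `rmarkdown::render()` appends the format's extension when `output_file` has none.

```r
# Hypothetical helper: same substitution as pre_knit above, in base R.
expand_tags <- function(lines) {
  gsub("~([[:alnum:]]+)\\((.*?)\\)~", "`r \\1(\"\\2\")`", lines, perl = TRUE)
}

# Hypothetical knit function with the (inputFile, encoding) signature that
# the knit: field is called with. The original file is only read, never written.
knit_tagged <- function(inputFile, encoding = "UTF-8", ...) {
  tmp <- tempfile(pattern = "tagged_", tmpdir = dirname(inputFile),
                  fileext = ".Rmd")
  on.exit(unlink(tmp), add = TRUE)   # remove the temporary copy afterwards
  writeLines(expand_tags(readLines(inputFile, encoding = encoding)), tmp)
  rmarkdown::render(
    tmp,
    # assumption: render() adds the extension of the output format
    output_file = tools::file_path_sans_ext(basename(inputFile)),
    output_dir  = dirname(inputFile),
    encoding    = encoding,
    ...
  )
}
```

Once the function lives in a package, the header line would shrink to something like `knit: mypkg::knit_tagged` (package name hypothetical).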

(for the curious, this is an attempt to borrow some of the markup concepts in Pollen)

I think one option could be to create your own output format using rmarkdown::output_format(). This function has the pre_knit argument you are looking for. You can get inspiration from how the existing formats in rmarkdown work, or from other packages that offer new formats.
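
As a rough sketch of that suggestion (not a tested solution): a format function can wrap `pdf_document()` through the `base_format` argument and attach a `pre_knit` hook. The names `expand_file` and `tagged_pdf` are hypothetical, and the sketch assumes that knitr reads the input file only after `pre_knit` has run, so that in-place edits are picked up.

```r
# Hypothetical helper: back up `input`, then rewrite ~fun(text)~ tags in place.
# Returns the path of the backup copy.
expand_file <- function(input) {
  backup <- paste0(input, ".orig")
  file.copy(input, backup, overwrite = TRUE)
  lines <- readLines(input)
  writeLines(gsub("~([[:alnum:]]+)\\((.*?)\\)~", "`r \\1(\"\\2\")`",
                  lines, perl = TRUE), input)
  backup
}

# Hypothetical format: pdf_document plus the pre_knit rewrite.
tagged_pdf <- function(...) {
  src <- NULL      # path of the source file, recorded by pre_knit
  backup <- NULL   # path of its untouched copy
  rmarkdown::output_format(
    knitr = NULL, pandoc = NULL,        # both inherited from base_format
    pre_knit = function(input, ...) {
      src <<- input
      backup <<- expand_file(input)
    },
    on_exit = function() {              # put the original source back
      if (!is.null(backup) && file.exists(backup)) {
        file.copy(backup, src, overwrite = TRUE)
        unlink(backup)
      }
    },
    base_format = rmarkdown::pdf_document(...)
  )
}
```

If the function were exported from a package, the YAML header could then select it with `output: mypkg::tagged_pdf` (package name hypothetical), which avoids both the second render and the infinite-loop concern.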

I guess the lack of documentation reflects the fact that knit: (and the elusive pre_knit: functionality I was after) isn't something one is expected to use. For what it's worth, I think it is a workable strategy with the existing toolchain, i.e. in one knit click I get the desired transformation, but it feels very hackish.

I've been reflecting on the whole Rmd toolchain, and in my view, instead of pursuing this pre_knit tag further, I would rather take a fresh look altogether at the full potential of a pandoc-centric toolchain.

Pandoc filters aren't easy to get into (and the pandocfilters package is seriously lacking in documentation), but this feels like a robust and powerful approach that bypasses many current limitations in knitr and rmarkdown, because R could do basically anything between the two ASTs, with full access to the document's structure: which node is code, text, header, etc.
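
To give a flavour of what "full access to the document's structure" means, here is a toy sketch in base R. It works on a hand-built list shaped like pandoc's JSON AST (the kind of nested list one would presumably get by parsing the output of `pandoc -t json`, e.g. with jsonlite and `simplifyVector = FALSE`); `walk_ast` and `small_caps` are hypothetical names, not part of any package.

```r
# Hand-built fragment shaped like pandoc's JSON AST once parsed into R lists:
# every element is a list with a type "t" and, usually, contents "c".
para <- list(t = "Para", c = list(
  list(t = "Str", c = "bizarre"),
  list(t = "Space"),
  list(t = "Str", c = "formatting")
))

# Hypothetical walker: visit children first, then apply `f` to each element.
walk_ast <- function(node, f) {
  if (!is.list(node)) return(node)          # leaf value (string, number)
  node <- lapply(node, walk_ast, f = f)     # recurse, names are preserved
  if (is.null(node[["t"]])) node else f(node)
}

# Example action: wrap every Str element in a SmallCaps element.
small_caps <- function(node) {
  if (identical(node[["t"]], "Str")) {
    list(t = "SmallCaps", c = list(node))
  } else {
    node
  }
}

out <- walk_ast(para, small_caps)
out$c[[1]]$t
# [1] "SmallCaps"
```

Because the action sees the node type, it can leave code blocks and headers alone and touch only running text, which is hard to guarantee with a line-based regex.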
