@john01
I will try to summarize the range of options for converting Rmd to PDF using HTML and CSS.
Disclaimer: this is an opinionated post.
First of all, keep in mind how R Markdown works:
- the first step is to execute the code chunks and transform the Rmd file into an md file: this is the job of knitr
- the second step is to transform the md file into an HTML file: this is the job of Pandoc (Pandoc is external software)
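To make the two steps concrete, here is a minimal sketch that runs them by hand. It assumes knitr, rmarkdown and Pandoc are installed; `rmd_to_html()` is a hypothetical helper name, and in practice `rmarkdown::render()` does all of this (and more) for you.

```r
# Hypothetical helper: run the two steps separately for illustration.
rmd_to_html <- function(rmd) {
  rmd  <- normalizePath(rmd)
  md   <- sub("\\.[Rr]md$", ".md", rmd)
  html <- sub("\\.[Rr]md$", ".html", rmd)

  # Step 1 (knitr): execute the code chunks and write the md file
  knitr::knit(rmd, output = md)

  # Step 2 (Pandoc): convert the md file to a standalone HTML file
  rmarkdown::pandoc_convert(md, to = "html", output = html,
                            options = "--standalone")
  html
}

# Usage (not run): rmd_to_html("report.Rmd")
```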
To produce a PDF, you need an extra step: converting the HTML (with its CSS) to PDF. There are many tools for this task. IMO, they fall into two categories:
- third-party software: HTML/CSS to PDF engines
- in-browser solutions (i.e. JavaScript libraries)
HTML/CSS to PDF engines
Here's a short list of the tools I know: html2pdf, wkhtmltopdf, PhantomJS, weasyprint, PrinceXML, PDFreactor, Antennahouse, Oxygen PDF Chemistry...
From my point of view, the main differences between these rendering tools are:
- open source (html2pdf, wkhtmltopdf, PhantomJS, weasyprint) vs. proprietary (PrinceXML, PDFreactor, Antennahouse, Oxygen PDF Chemistry)
- tools supporting the CSS Paged Media standard (weasyprint, PrinceXML, PDFreactor, Antennahouse, Oxygen PDF Chemistry) vs. other tools.
The CSS Paged Media standard is composed of four different modules: CSS Paged Media Module Level 3, CSS Generated Content for Paged Media Module, CSS Page Floats, and CSS Fragmentation Module Level 3.
The CSS Paged Media Standard helps to transform HTML to a paged document (like XSL-FO for XML documents).
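As a rough illustration of what this standard adds to ordinary CSS, the sketch below writes a small stylesheet from R. The `@page` rule, the `size` property, the `@bottom-center` margin box and `counter(page)` come from the paged media modules, and `break-before` comes from the fragmentation module. The file name `paged.css` is an arbitrary choice, and how much of this a given engine supports varies from tool to tool.

```r
# A small stylesheet using CSS Paged Media features (illustrative only)
css <- c(
  "@page {",
  "  size: A4;",                                    # page size
  "  margin: 2cm;",                                 # page margins
  "  @bottom-center { content: counter(page); }",   # page number in the footer
  "}",
  "h1 { break-before: page; }"                      # each h1 starts a new page
)

# Written to a temporary directory; copy it next to your HTML file to use it
css_file <- file.path(tempdir(), "paged.css")
writeLines(css, css_file)
```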
In-browser solutions
With your browser, you can already print to PDF.
In this category, you will find many client-side JavaScript libraries: jsPDF, html2canvas, Print.js, Vivliostyle.js, Paged.js...
From my point of view, the main differences are:
- tools building PDF (or images) in browser vs. tools enhancing the HTML content in order to control the PDF generated by the browser
- tools supporting the CSS Paged Media standard
Integration with R Markdown
Third-party software
Pandoc natively supports wkhtmltopdf, weasyprint and PrinceXML (you need to install these tools yourself). See https://pandoc.org/MANUAL#creating-a-pdf
As @cderv said, you can use them directly from R:
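# Name the engine explicitly: Pandoc's default HTML-to-PDF engine may depend on the Pandoc version
rmarkdown::pandoc_convert("file.html", output = "with_wkhtmltopdf.pdf", to = "html5", options = c("--pdf-engine", "wkhtmltopdf"))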
rmarkdown::pandoc_convert("file.html", output = "with_weasyprint.pdf", to = "html5", options = c("--pdf-engine", "weasyprint"))
rmarkdown::pandoc_convert("file.html", output = "with_prince.pdf", to = "html5", options = c("--pdf-engine", "prince"))
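Since these calls fail when the engine is not installed, it can help to check first which engines are available. This is a small sketch using base R only; it assumes the executables are on your PATH under the names `wkhtmltopdf`, `weasyprint` and `prince`, which may not hold on every system.

```r
engines <- c("wkhtmltopdf", "weasyprint", "prince")

# Sys.which() returns "" for executables that are not found on the PATH
installed <- nzchar(Sys.which(engines))
names(installed) <- engines

# TRUE means the engine was found, FALSE means it was not
print(installed)
```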
I am opinionated about HTML/CSS to PDF converters and only consider tools that support the CSS Paged Media standard (so I don't use wkhtmltopdf). I developed the weasydoc package to ease my workflow with weasyprint and PrinceXML.
In-browser solutions
The main advantage of client-side libraries is that you can simply use your browser to create a PDF.
I will not go into the pros and cons of the different JavaScript libraries. In short, my recommendation is to use Paged.js (but feel free to test and use any other library).
Paged.js is used in the pagedown package, so you have a native solution to create a PDF from R Markdown using HTML/CSS. I am much more comfortable with Paged.js than with any other tool or library, so I tend to recommend it.
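For reference, a minimal sketch of the pagedown route could look like this. It assumes the pagedown package and Chrome (or Chromium) are installed; `rmd_to_paged_pdf()` is a hypothetical helper name, not a pagedown function.

```r
# Hypothetical helper: Rmd -> paged HTML (Paged.js) -> PDF via headless Chrome
rmd_to_paged_pdf <- function(rmd) {
  # Render the Rmd with a pagedown output format that uses Paged.js
  html <- rmarkdown::render(rmd, output_format = pagedown::html_paged())

  # Print the paged HTML to PDF with headless Chrome
  pagedown::chrome_print(html)
}

# Usage (not run): rmd_to_paged_pdf("report.Rmd")
```

`pagedown::chrome_print()` also accepts an Rmd file directly, in which case it renders the document before printing it.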