Extracting help documentation as HTML _with_ links

I'm assembling a docset of the tidyverse packages for use with the Dash documentation browser. It will be included in the Dash user contributions as soon as I'm done.

I'm almost done, but one thing is missing: the code I'm using to pull out the Rd documentation as HTML doesn't include any hyperlinks between pages, for example in the "See Also" sections.

Below is the code that I'm using to get the HTML docs. Linked functions are wrapped in <code> blocks, but there are no hyperlinks. Is there an alternative way to extract the HTML that preserves the links?

My plan B is to extract the <code> blocks and check whether they are valid functions, but that could be error-prone, and the link information must already exist somewhere.
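To make plan B concrete, here is a rough sketch in base R. The helper `link_code_blocks()` and the `alias_to_page` lookup are hypothetical names invented for illustration; the lookup is assumed to be a named character vector mapping a function name to the HTML page that documents it, which would still have to be built from the packages' alias information. It only recognises `<code>name()</code>` blocks, so anything written differently is left untouched.

```r
# Hypothetical helper: wrap <code>name()</code> in a hyperlink when `name`
# appears in `alias_to_page` (assumed: named character vector, name -> page).
link_code_blocks <- function(html, alias_to_page) {
  pattern <- "<code>([A-Za-z.][A-Za-z0-9._]*)\\(\\)</code>"
  hits <- gregexpr(pattern, html, perl = TRUE)
  found <- regmatches(html, hits)[[1]]
  if (length(found) == 0) return(html)  # nothing that looks like a call

  linked <- vapply(found, function(block) {
    name <- sub(pattern, "\\1", block, perl = TRUE)
    page <- unname(alias_to_page[name])
    # Unknown names stay as plain <code> blocks
    if (is.na(page)) block else sprintf('<a href="%s">%s</a>', page, block)
  }, character(1), USE.NAMES = FALSE)

  regmatches(html, hits) <- list(linked)
  html
}

# Usage with a made-up lookup: geom_col() is documented on the geom_bar page
lookup <- c(geom_bar = "geom_bar.html", geom_col = "geom_bar.html")
snippet <- "<p>Use <code>geom_col()</code>, <code>weight</code> or <code>foo()</code>.</p>"
result <- link_code_blocks(snippet, lookup)
```

The weak spot is exactly the one mentioned above: the match is purely textual, so a name that exists in two packages, or a function mentioned without parentheses, would be linked wrongly or not at all.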

library(tools)
library(tidyverse)
packages <- c("ggplot2","dplyr","tidyr","readr","tibble","stringr","forcats","purrr")

# One row per help topic: the parsed Rd object plus the topic name
docs <- tibble(pkg = packages) %>%
    mutate(
        documentation = map(pkg, Rd_db),
        topic = map(documentation, names),
        # strip only the trailing file extension from the Rd file name
        topic = map(topic, str_replace, "\\.Rd$", "")
    ) %>% unnest(cols = c(topic, documentation))

# Render one Rd object to a single HTML string
capture_html <- function(rd) paste(capture.output(tools::Rd2HTML(rd)), collapse = "\n")
docs <- docs %>% mutate(html = map_chr(documentation, capture_html))

docs %>% filter(topic == "geom_bar") %>% pull(html) %>% str_sub(1, 2000) %>% cat()
#> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><title>R: Bar charts</title>
#> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
#> <link rel="stylesheet" type="text/css" href="R.css" />
#> </head><body>
#> 
#> <table width="100%" summary="page for geom_bar"><tr><td>geom_bar</td><td style="text-align: right;">R Documentation</td></tr></table>
#> 
#> <h2>Bar charts</h2>
#> 
#> <h3>Description</h3>
#> 
#> <p>There are two types of bar charts: <code>geom_bar()</code> and <code>geom_col()</code>.
#> <code>geom_bar()</code> makes the height of the
#> bar proportional to the number of cases in each group (or if the
#> <code>weight</code> aesthetic is supplied, the sum of the weights). If you want the
#> heights of the bars to represent values in the data, use
#> <code>geom_col()</code> instead. <code>geom_bar()</code> uses <code>stat_count()</code> by
#> default: it counts the number of cases at each x position. <code>geom_col()</code>
#> uses <code>stat_identity()</code>: it leaves the data as is.
#> </p>
#> 
#> 
#> <h3>Usage</h3>
#> 
#> <pre>
#> geom_bar(
#>   mapping = NULL,
#>   data = NULL,
#>   stat = "count",
#>   position = "stack",
#>   ...,
#>   width = NULL,
#>   binwidth = NULL,
#>   na.rm = FALSE,
#>   orientation = NA,
#>   show.legend = NA,
#>   inherit.aes = TRUE
#> )
#> 
#> geom_col(
#>   mapping = NULL,
#>   data = NULL,
#>   position = "stack",
#>   ...,
#>   width = NULL,
#>   na.rm = FALSE,
#>   show.legend = NA,
#>   inherit.aes = TRUE
#> )
#> 
#> stat_count(
#>   mapping = NULL,
#>   data = NULL,
#>   geom = "bar",
#>   position = "stack",
#>   ...,
#>   width = NULL,
#>   na.rm = FALSE,
#>   orientation = NA,
#>   show.legend = NA,
#>   inherit.aes = TRUE
#> )
#> </pre>
#> 
#> 
#> <h3>Arguments</h3>
#> 
#> <table summary="R argblock">
#> <tr valign="top"><td><code>mapping</code></td>
#> <td>
#> <p>Set of aesthetic mappings created by <code>aes()</code> or
#> <code>aes_()</code>. If specified and <code>inherit.aes = TRUE</code> (the
#> default), it is combined with the default mapping at the top level of the
#> plot. You must supply <c

Created on 2020-03-24 by the reprex package (v0.3.0)
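
One avenue that may be worth checking for the question above: `tools::Rd2HTML()` has a `Links` argument, and `tools::findHTMLlinks()` returns a named character vector of link targets for the installed packages. The sketch below passes that vector through. It uses the base `stats` package so it runs without the tidyverse installed, and `capture_html_linked()` is a hypothetical variant of `capture_html()` above; I have not verified how the result behaves for every tidyverse topic.

```r
library(tools)

# Topic -> HTML file lookup for the installed packages
links <- findHTMLlinks()

# Hypothetical variant of capture_html() that passes the lookup via `Links`
capture_html_linked <- function(rd, links) {
  paste(capture.output(Rd2HTML(rd, Links = links)), collapse = "\n")
}

# Render the help page for lm() from the stats package
db <- Rd_db("stats")
lm_rd <- db[[grep("(^|/)lm\\.Rd$", names(db))[1]]]
html <- capture_html_linked(lm_rd, links)

# Check whether any anchors were emitted
has_links <- grepl("<a href=", html, fixed = TRUE)
```

The targets produced this way are relative paths in the layout of R's own HTML help, so they would probably still need rewriting to match the file layout inside the docset.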
