How to preserve hyperlinks in r htmltable while writing to a csv/excel file?

There are multiple hyperlinks in an htmltable. I want those hyperlinks to be preserved while writing to a csv file.

I tried converting to a dataframe but in both cases the csv has the links expanded.

I want to preserve the hyperlinks in data1$new in csv/excel file.

library(adapr)
library(htmlTable)
library(stringr)

data = data.frame("word"=c('python py', 'java'), 
                  "description"=c('Java is a statically typed and Python py is a dynamically typed',
                                  'java is a programming language'), stringsAsFactors = FALSE)
ll <- as.list(data$word)

data$new <- data$description
for(i in seq_len(nrow(data))) for(j in seq_along(ll)) {
  url <- paste0("http://www.google.com/",ll[[j]])
  url1 <- makeHyperlink(url,ll[[j]])
  url2 <- str_replace(url1,"file:///","")
  data$new[i] <- sub(ll[[j]], url2, data$new[i], ignore.case = TRUE)
}
data1<-htmlTable(data)

Here are a couple of ways I know of to preserve hyperlinks when writing to .csv/.xlsx

Hacky Method

Excel has a HYPERLINK function that has the format =HYPERLINK(LINK, TEXT) so we can create a column with exactly this text. Then when you open the .csv file with Excel, the formula gets parsed. Obvious downside is that this is an Excel-specific function (as far as I know) so any other .csv viewer will probably fail to render the link.

# Imports
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tibble)
library(glue)
#> 
#> Attaching package: 'glue'
#> The following object is masked from 'package:dplyr':
#> 
#>     collapse

# Some mock data
test_data <- tribble(
  ~id, ~link,
  "Example link #1", "https://google.ca",
  "Example link #2", "https://community.rstudio.com"
) %>%
  mutate(
    link = glue(
      '=HYPERLINK("{link}", "{id}")'
    )
  )

write.csv(test_data, "test.csv", row.names = FALSE)

Created on 2019-08-19 by the reprex package (v0.3.0)

Output:

Using openxlsx

A cleaner solution is to use openxlsx. We can set the link column's class to hyperlink and openxlsx will respect that class when writing.

# Imports
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tibble)
library(openxlsx)

# Some mock data
test_data <- tribble(
  ~id, ~link,
  "Example link #1", "https://google.ca",
  "Example link #2", "https://community.rstudio.com"
) 

class(test_data$link) <- "hyperlink"

write.xlsx(
  test_data,
  "test_out.xlsx"
)
#> Note: zip::zip() is deprecated, please use zip::zipr() instead

Created on 2019-08-19 by the reprex package (v0.3.0)

Output:
image

1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.