pagedown::chrome_print unable to produce large batches of parameterized reports

Hi all,
I am trying to render ~3000 parameterized reports through pagedown::chrome_print.

I usually run a few samples to make sure that things are rendering correctly.

The initial problem:

  • somewhere in the region of 170 ~ 210 files render perfectly, then RStudio loses its connection

The initial attempted solution:

  • run the script from the command-line (macOS Catalina 10.15.7)

Result:

  • either an endless stall, or errors for every file after number 200ish.

A bit of googling made me think it was related to the default limit on the number of files that can be open at once (https://github.com/jupyterlab/jupyterlab/issues/6727). Running
ulimit -n
returns 256 on my machine.
Raising it to 10000 in the same terminal window, immediately before running the script, did not change the result: still 170 ~ 210 files, and then either an indefinite stall part-way through or errors for every file from then on (the errors at least let me save what they were in an RDS file).
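To double-check which limit the R process itself is running under, it can be queried from inside R. This is a minimal sketch, not something from the original script: `fd_limit()` is a hypothetical helper name, and it assumes a Unix-like shell where `ulimit -n` is available. It relies on the fact that a shell started by `system()` inherits its limits from the R process but cannot change them for its parent.

```r
# Hypothetical helper: report the soft open-file limit that child shells
# inherit from this R process.
fd_limit <- function() {
  system("ulimit -n", intern = TRUE)
}

before <- fd_limit()

# Changing the limit inside a system() call only affects that short-lived
# child shell. Errors are discarded in case the hard limit is below 10000.
in_child <- system("ulimit -n 10000 2>/dev/null; ulimit -n", intern = TRUE)

after <- fd_limit()

# TRUE: the limit seen by R is unchanged by the call above.
identical(before, after)
```

If `before` already reports the raised value when the script is started from the terminal where `ulimit -n 10000` was run, the limit did reach R; a `ulimit` call made through `system()` inside the script would not be expected to change it either way.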

Here is my best attempt at a reprex:

reprex.Rmd:

---
title: "reprex"
date: "12/4/2020"
output:
  pagedown::html_paged:
    toc: false
    number_sections: false
params:
  a: "a"
  b: 1
knit: pagedown::chrome_print
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

Parameter "a" is: `r params$a`

Parameter "b" is: `r params$b`

render.R:

# make sure the ulimit is high enough
system(command = "ulimit -n 10000")

# 400 sets of parameters
parameters <- tidyr::expand_grid(a = c("a", "b", "c", "d"),  b = runif(100))

# make the parameters compatible with YAML 
# side note - is there a better way to do this?
params <- as.list(rlist::list.parse(parameters))

# prep the outpaths
args <- list(a = parameters$a, b = parameters$b)
paths <- purrr::pmap(args, ~c(paste0("./", ..1, "/", ..1, "_", ..2, ".pdf")))

# keep output/errors
safe_render <- purrr::safely(pagedown::chrome_print)

# render
output <- purrr::map2(params, paths, 
               ~safe_render(
                 rmarkdown::render("reprex.rmd", params = .x, envir = new.env()), output = .y))

# make errors etc available after running from the command line
saveRDS(output, "output.RDS")
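On the side note in render.R about making the parameters YAML-compatible: one possible alternative is to build one named list per row in base R. This is only a sketch under assumptions: the small grid below is a stand-in for the real parameter table, `expand.grid()` replaces `tidyr::expand_grid()` to keep the example dependency-free, and `params_list` is a hypothetical name.

```r
# Hypothetical stand-in for the parameter grid.
parameters <- expand.grid(a = c("a", "b"), b = c(0.25, 0.75),
                          stringsAsFactors = FALSE)

# One named list per row, in the shape rmarkdown::render(params = ...) accepts.
params_list <- lapply(seq_len(nrow(parameters)), function(i) {
  as.list(parameters[i, ])
})

str(params_list[[1]])
```

With purrr already in use, `purrr::pmap(parameters, list)` should give the same row-wise structure.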

Result:

  • Command-line
    processing file: reprex.rmd
    ...
    ...
    Output created: reprex.html

200ish times, then

Error: unexpected ',' in "ms = .x,"
Execution halted

  • troubleshooting post-hoc
    (sorry, I am still pretty bad at working with lists)
```{r}
library(magrittr)
tibble::tibble(output = readRDS("output.RDS")) %>% 
  tidyr::unnest_wider(output) %>% 
  dplyr::count(error)
```

> | | error | n |
> | --- | --- | --- |
> |1|NULL|200|
> |2|list(message = "Failed to generate output. Reason: Failed to open http://127.0.0.1:5306/reprex.html (HTTP status code: 500)", call = force(expr))|1|
> |3|list(message = **"Cannot create pipe when running '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome' (system error 24, Too many open files)** @unix/processx.c:455 (processx_exec)", call = rethrow_call(c_processx_exec, command, c(command, args), stdin, stdout, stderr, pty, pty_options, connections, env, windows_verbatim_args, windows_hide_window, private, cleanup, wd, encoding, paste0("PROCESSX_", private$tree_id, "=YES")), `_nframe` = 16, `_ignore` = list(c(17, 21)))|4|
> |4|list(**message = "cannot make processx socketpair (system error 24, Too many open files) @unix/processx.c:408 (processx__make_socketpair)**", call = rethrow_call(c_processx_connection_create_pipepair, encoding, nonblocking), `_nframe` = 17, `_ignore` = list(c(18, 22)))|195|

The parts of the error messages that I think are pertinent are in bold.
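Since both messages point at "Too many open files", one way to see whether descriptors accumulate between renders is to count them from inside R. The sketch below is an assumption-heavy diagnostic, not a fix: `open_fd_count()` is a hypothetical helper, and it assumes a Unix-like system where `/dev/fd` lists the calling process's descriptors (as on macOS and Linux). The count includes the handle used for the listing itself, so only differences between calls are meaningful.

```r
# Hypothetical helper: count file descriptors currently open in this R process.
open_fd_count <- function() {
  length(list.files("/dev/fd"))
}

n_before <- open_fd_count()
con <- file(tempfile(), open = "w")  # deliberately hold one extra descriptor
n_during <- open_fd_count()
close(con)
n_after <- open_fd_count()

c(before = n_before, during = n_during, after = n_after)
```

Calling `open_fd_count()` after each render in the loop would show whether the count climbs steadily toward the limit, which would suggest something is not being released between reports.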

I have had success rendering batches this size and greater in the past with flexdashboard > png > imagemagick to convert to pdf (valueboxes don't render correctly when going directly to pdf) using an almost identical workflow, so I am trying to figure out what is going on. I imagine it must be something to do with headless chrome?

I have been explicitly asked for printable, paged, pretty PDFs for this one, so in the worst case I will split everything up and render 100 reports at a time, but I don't want to waste an entire day doing so.
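For reference, the splitting itself is cheap to script. A minimal sketch, where `batch_indices()` is a hypothetical helper and the batch size of 100 is just the figure mentioned above:

```r
# Hypothetical helper: split indices 1..n into consecutive batches
# of at most `size` elements.
batch_indices <- function(n, size = 100) {
  idx <- seq_len(n)
  unname(split(idx, ceiling(idx / size)))
}

batches <- batch_indices(3000, size = 100)
length(batches)        # 30
range(batches[[30]])   # 2901 3000
```

Each batch could then be rendered in its own fresh R session (for example a separate `Rscript` call), so that anything held open by one batch is released when that process exits. Whether this actually avoids the failure here is an untested assumption.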

Edit: session info for the reprex.
The error/result is identical on my office computer running R 4.0.3; the platform and OS version are otherwise the same.

sessionInfo()
> R version 4.0.2 (2020-06-22)
> Platform: x86_64-apple-darwin17.0 (64-bit)
> Running under: macOS Catalina 10.15.7
> 
> Matrix products: default
> BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
> 
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods  
> [7] base     
> 
> loaded via a namespace (and not attached):
>  [1] rstudioapi_0.13   knitr_1.30        magrittr_2.0.1   
>  [4] tidyselect_1.1.0  R6_2.5.0          rlang_0.4.8      
>  [7] fansi_0.4.1       dplyr_1.0.2       tools_4.0.2      
> [10] data.table_1.13.2 xfun_0.19         cli_2.2.0        
> [13] htmltools_0.5.0   ellipsis_0.3.1    yaml_2.2.1       
> [16] assertthat_0.2.1  digest_0.6.27     tibble_3.0.4     
> [19] lifecycle_0.2.0   crayon_1.3.4      purrr_0.3.4      
> [22] tidyr_1.1.2       vctrs_0.3.5       rlist_0.4.6.1    
> [25] glue_1.4.2        evaluate_0.14     rmarkdown_2.5    
> [28] compiler_4.0.2    pillar_1.4.7      generics_0.1.0   
> [31] pagedown_0.12     pkgconfig_2.0.3  
