CRASH: entire PC freezes when rmarkdown::render is called in a loop (…on the 6th iteration)

rmarkdown

#1

…This is a repost from RStudio Support; as far as I understand, that board will be migrated. I apologise for the double posting, but would nonetheless greatly appreciate any help here…

I have 13 reports that I wish to produce from a template .rmd file. To achieve this I’ve constructed a loop with a counter that grabs the relevant parts of a data frame to populate the template for each report.

I can successfully loop through the code 5 times and HTML files are created, but on the 6th iteration my entire machine freezes and requires a hard reboot before I regain control.

I’ve invoked the following commands at each iteration of the loop in an attempt to “solve” the problem, but this does not seem to work:

```r
Sys.unsetenv("LC_ALL")
Sys.unsetenv("LANG")
```

I’ve read that people are experiencing problems with pandoc, but their symptoms don’t quite match mine, since I lose control of my entire PC - not merely the RStudio IDE (everything freezes).

I’ve run this process with a “brand new, clean” .rmd file as the template, placing it in a loop and calling it with the following command:

```r
i <- 1
while (i <= 13) {
  rmarkdown::render("d:/template.rmd",
                    output_format = "html_document",
                    output_file = "d:/out.html")
  i <- i + 1
}
```
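For completeness, here is the same loop written with a distinct output file per report and a fresh evaluation environment for each render (the numbered file names and the `envir` choice are illustrative, not a confirmed fix for the crash):

```r
library(rmarkdown)

for (i in 1:13) {
  render(
    "d:/template.rmd",                              # same template for every report
    output_format = "html_document",
    output_file   = sprintf("d:/out_%02d.html", i), # distinct file per report
    envir         = new.env()                       # fresh evaluation environment each time
  )
}
```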

…and my entire PC crashes on the 6th iteration. In my actual program the while loop uses different criteria, but this illustrates my basic script layout.


sessionInfo():

R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_South Africa.1252 LC_CTYPE=English_South Africa.1252 LC_MONETARY=English_South Africa.1252 LC_NUMERIC=C
[5] LC_TIME=English_South Africa.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] kableExtra_0.7.0 sqldf_0.4-11 RSQLite_2.0 gsubfn_0.6-6 proto_1.0.0 RPostgreSQL_0.6-2 DBI_0.7 markdown_0.8 knitr_1.19
[10] rmarkdown_1.8

loaded via a namespace (and not attached):
[1] Rcpp_0.12.14 xml2_1.1.1 magrittr_1.5 hms_0.4.1 munsell_0.4.3 rvest_0.3.2 bit_1.1-12 viridisLite_0.2.0 colorspace_1.3-2
[10] R6_2.2.2 rlang_0.1.4 plyr_1.8.4 httr_1.3.1 stringr_1.2.0 blob_1.1.0 tcltk_3.4.2 tools_3.4.2 htmltools_0.3.6
[19] yaml_2.1.16 bit64_0.9-7 rprojroot_1.3-2 digest_0.6.12 tibble_1.3.4 readr_1.1.1 memoise_1.1.0 evaluate_0.10.1 stringi_1.1.6
[28] compiler_3.4.2 scales_0.5.0 backports_1.1.1 lubridate_1.7.1 chron_2.3-51 pkgconfig_2.0.1


#2

An entire-PC freeze-up/crash sounds like your CPUs are saturated, you are out of memory, or (least likely) you are out of disk space. Have you run some performance monitoring while the task executes? The other alternative is to start directly with the 6th iteration and see if there is something wrong with that iteration/dataset. What happens in those cases?


#3

Thanks for the response Cole,

I’ve played around (even before my original post) - I can execute the render statement for the first 3 cases in the loop (without a crash), then run it again over exactly the same 3 cases (which previously ran and produced output), and on the 6th iteration the PC freezes up again. Also, as per your suggestion, prior to my original post I ran the 6th iteration on its own to rule out any programmatic error on my behalf.

Also, I can confirm that I’m not out of disk space.

When running gc() before and after the render statement, the crash still occurs on the 6th iteration (starting anywhere in the loop, and again rerunning parts of the loop that previously worked fine). However, I obtain the following verbose feedback (it gets interesting at the end):

(explicit gc before 1st render)
Garbage collection 35 = 24+5+6 (level 0) …
44.1 Mbytes of cons cells used (71%)
9.5 Mbytes of vectors used (60%)

(render & verbose feedback - this feedback is displayed through the render of the 1st iteration and doesn’t change much in terms of values)
processing file: Eastbrooke_Account_Statements.Rmd
Garbage collection 37 = 25+5+7 (level 0) …
45.1 Mbytes of cons cells used (58%)
9.6 Mbytes of vectors used (61%)

(on second iteration - explicit gc)
Garbage collection 53 = 39+5+9 (level 0) …
48.9 Mbytes of cons cells used (63%)
10.4 Mbytes of vectors used (53%)

(render & verbose feedback - this feedback is displayed through the render of the 2nd iteration and increases slightly, but doesn’t change much in terms of values)
Garbage collection 60 = 45+6+9 (level 0) …
49.9 Mbytes of cons cells used (65%)
10.8 Mbytes of vectors used (55%)

(on third iteration - explicit gc)
Garbage collection 72 = 53+6+13 (level 0) …
49.4 Mbytes of cons cells used (64%)
11.0 Mbytes of vectors used (56%)

(on fourth iteration - explicit gc)
Garbage collection 76 = 55+6+15 (level 0) …
49.6 Mbytes of cons cells used (64%)
12.8 Mbytes of vectors used (53%)

(render & verbose feedback - this feedback is displayed through the render of the 4th iteration and suddenly changed relative to the previous iterations)
Garbage collection 77 = 55+6+16 (level 2) …
49.6 Mbytes of cons cells used (64%)
12.7 Mbytes of vectors used (28%)
Garbage collection 78 = 55+6+17 (level 2) …
49.5 Mbytes of cons cells used (64%)
35.5 Mbytes of vectors used (49%)

(on 5th iteration - explicit gc)
Garbage collection 81 = 56+6+19 (level 0) …
49.6 Mbytes of cons cells used (64%)
33.7 Mbytes of vectors used (47%)

(verbose feedback at the end of the 5th iteration - notice how the Mbytes of vectors used spike through the roof)
Garbage collection 82 = 56+6+20 (level 2) …
49.6 Mbytes of cons cells used (64%)
33.6 Mbytes of vectors used (9%)
Garbage collection 83 = 56+6+21 (level 2) …
49.5 Mbytes of cons cells used (64%)
307.0 Mbytes of vectors used (44%)

(explicit gc on 6th iteration, at which point the PC freezes)
Garbage collection 101 = 66+8+27 (level 2) …
66.2 Mbytes of cons cells used (57%)
3569.9 Mbytes of vectors used (43%)

  1. Is there perhaps an alternative to gc() that I can use to clear the cache?
  2. Furthermore, this “spike” in memory occurs even if I render a “clean, standard” report with no calculations/data (so I’m convinced this has nothing to do with the specific report or data)
  3. Is there any other memory tracking software that you can advise me to use to see if the problem is in R / pandoc?

#4

Hi Cole, any thoughts on my extended description of the problem? Best Regards


#5

Hey, yes. By the way, referring to people with @cjdw007 or replying directly to their message ensures that they get a notification about your reply :slight_smile:

In any case, gc() itself is unlikely to help your cause, because R garbage-collects fairly actively on its own. It sounds like you have ruled out any correlation with which report you are running - it is always the 6th render regardless? Is there any chance you could provide a minimal reproducible example that I could try on my own system? Especially since even a “clean, standard” report seems to cause the issue.

My only other thought on monitoring would be profvis. You probably want to make sure it writes to a file in case the crash happens. Besides that, you can just open the Task Manager and watch what happens when the 6th iteration hits. You could also try running the R script from a terminal with Rscript blah.R to take the IDE out of the equation, just in case.
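Building on that last suggestion: one way to take process isolation further is to run each render in a throwaway R session, so any memory that rmarkdown/pandoc accumulates is reclaimed when the child process exits. A minimal sketch, assuming the callr package is installed (the template path and output file names are illustrative):

```r
library(callr)  # runs R code in a separate, fresh R process

for (i in 1:13) {
  # Each call spawns a new R session, renders one report, then exits,
  # returning its memory to the OS regardless of any leak inside render().
  callr::r(
    function(template, out) rmarkdown::render(template, output_file = out),
    args = list(template = "d:/template.rmd",
                out      = sprintf("d:/out_%02d.html", i))
  )
}
```

If the freeze disappears under this scheme, that would point fairly strongly at memory accumulating inside the long-lived R/pandoc process rather than at any one report.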