Memory usage and R's global string pool

Is this memory causing an issue on the machines where you are running this?

It would help us help you if you used a reprex (https://www.tidyverse.org/help/) with embedded comments rather than pasting code with separate comments. Reprexes are not only useful for asking questions; they are also useful during development, because they make it easy to create reusable snippets of code for trying things out.

Here is a reprex based on your example, with embedded comments. It is much easier to see exactly when in the code the measurements are being made, and it makes it much easier for us to duplicate what you are doing.

This was run on macOS, and it shows that removing big_char_df returns the memory it was using. The example starts off using about 35MB of memory, climbs to 468MB after building big_char_df, then drops back to about 68MB after big_char_df is removed, according to pryr::mem_used().

Because of layers of caching and virtual memory, it might take a while before the OS realizes the memory is available; in fact, it might not know the memory is available until R tries to allocate more. And of course it's also possible that R has not actually freed the memory. That's why I asked whether the memory usage of this example was causing problems for other applications.
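One way to see the difference between what R's allocator thinks it is using and what the OS thinks the process is holding is to compare gc()'s numbers against the process's resident set size. A minimal sketch (the shell call assumes macOS/Linux, where `ps -o rss=` reports RSS in kilobytes):

```r
# R-level usage: Ncells + Vcells "(Mb)" columns from gc()
r_heap_mb <- sum(gc()[1:2, 2])

# OS-level usage: resident set size of this R process, via the system ps tool
os_rss_mb <- as.numeric(
    system(sprintf("ps -o rss= -p %d", Sys.getpid()), intern = TRUE)
) / 1024

round(c(r_heap = r_heap_mb, os_rss = os_rss_mb), 1)
```

If os_rss stays high after R's numbers drop, the allocator simply hasn't returned the pages to the OS yet, which is harmless unless another application actually needs that memory.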

NUM_LETTERS <- 10
NUM_ROWS <- 1e5
NUM_COLS <- 50

# initial memory usage
pryr::mem_used()
#> 34.6 MB
gc()
#>          used (Mb) gc trigger (Mb) max used (Mb)
#> Ncells 489139 26.2     940480 50.3   750400 40.1
#> Vcells 921423  7.1    1650153 12.6  1223402  9.4
big_char_df <- lapply(seq_len(NUM_COLS), function(c) {
    vapply(
        seq_len(NUM_ROWS),
        function(r) paste(sample(letters, NUM_LETTERS, TRUE), collapse = ""),
        ""
    )
})
names(big_char_df) <- paste0("var", seq_len(NUM_COLS))

big_char_df <- as.data.frame(big_char_df, stringsAsFactors = FALSE)
# memory usage after building big_char_df
pryr::mem_used()
#> 468 MB
gc()
#>            used  (Mb) gc trigger  (Mb) max used  (Mb)
#> Ncells  5490142 293.3    8273852 441.9  6573678 351.1
#> Vcells 20052566 153.0   31186712 238.0 25922260 197.8
# after gc complete
pryr::mem_used()
#> 468 MB
# size of big_char_df according to object.size
format(object.size(big_char_df), units = "Mb")
#> [1] "305.2 Mb"
# pryr check of size of big_char_df (format() does not honor the units
# argument here, so the result prints in bytes; 320005712 bytes is
# about 305.2 Mb, matching object.size above)
format(pryr::object_size(big_char_df), units = "Mb")
#> [1] "320005712"
rm(big_char_df)
# after big_char_df removed
pryr::mem_used()
#> 67.9 MB
gc()
#>           used (Mb) gc trigger  (Mb) max used  (Mb)
#> Ncells  490904 26.3    5295264 282.8  8273852 441.9
#> Vcells 5054091 38.6   19959495 152.3 29133986 222.3
# after 2nd gc
pryr::mem_used()
#> 67.9 MB
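On the global string pool itself (the thread title): R interns each unique string once in a shared cache, so a character vector full of repeated strings costs little more than one copy of the string plus a pointer per element, while a vector of all-unique strings pays for every entry in the pool. A minimal sketch, assuming pryr is installed:

```r
# 1e5 copies of the same string: one pool entry, shared by every element
repeated <- rep("abcdefghij", 1e5)

# 1e5 (almost certainly) distinct strings: one pool entry each
unique_strs <- vapply(
    seq_len(1e5),
    function(i) paste(sample(letters, 10, TRUE), collapse = ""),
    ""
)

pryr::object_size(repeated)     # small: mostly just the pointer vector
pryr::object_size(unique_strs)  # much larger: every string is stored
```

This is also why object.size and pryr::object_size can disagree on objects that share strings with other objects; in your example they happen to agree because big_char_df's strings are effectively all unique.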