I am trying to free memory and stop a process once memory becomes constrained on a forked (FORK) cluster. However, even though the process halts, I cannot seem to reclaim the memory. On a PSOCK cluster this works fine, but on a FORK cluster no memory gets freed even after calling
rm(list = ls())
gc()
See below for an example. Memory usage can be watched with htop on Linux. I've also tried replacing stop() with mcexit(), but that just causes more problems and doesn't seem to free any memory either.
I am running on Ubuntu 18.04 and R 3.6.1.
library(parallel)

noOfThreads <- detectCores() - 1
cluster <- makeCluster(noOfThreads, type = "FORK") # Remove type = "FORK" and this will work

clusterApplyLB(cluster, 1:noOfThreads, function(x) {
  bigList <- list()
  index <- 1
  while (TRUE) {
    bigList[[index]] <- rep('x', 10000000)
    index <- index + 1
    cat(sprintf("%i: %i\n", x, index))
    memFree <- as.numeric(system("awk '/MemFree/ {print $2}' /proc/meminfo", intern = TRUE))
    memTotal <- as.numeric(system("awk '/MemTotal/ {print $2}' /proc/meminfo", intern = TRUE))
    if (memFree / memTotal < 0.5) {
      rm(list = ls())
      gc()
      stop("Out of Memory")
    }
  }
})
# stopCluster(cluster) # Will free up the memory
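Since stopCluster() does free the memory, one workaround (a sketch, assuming the memory only returns to the OS when the forked children exit) is to give each heavy batch of work its own short-lived cluster. The runBatch() helper below is hypothetical, not part of the question's code:

```r
library(parallel)

# Hypothetical helper: run one batch on a fresh FORK cluster (Unix-only),
# then stop it so the child processes exit and the OS reclaims whatever
# they held, regardless of what rm()/gc() did inside them.
runBatch <- function(inputs, fun) {
  cl <- makeCluster(max(1, detectCores() - 1), type = "FORK")
  on.exit(stopCluster(cl))  # workers exit here -> memory is reclaimed
  clusterApplyLB(cl, inputs, fun)
}

res <- runBatch(1:2, function(x) x + 1)
```

The cost is re-forking for every batch, but each batch then starts from a clean memory footprint.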
I have not done this kind of parallelisation myself, but installr has some utilities for killing processes, like
kill_all_Rscript_s()
I actually don't think manually calling gc() will do much for you; it's probably doing a pretty good job already. And I wouldn't use rm(list = ls()); just focus on the variables that are causing problems.
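If you only need to stop particular workers rather than every Rscript process on the machine, base R's tools::pskill() can signal a process by PID. A minimal sketch (the throwaway cluster here just demonstrates collecting the PIDs):

```r
library(parallel)

# Ask each worker for its own PID; Sys.getpid() runs inside the worker.
cl <- makeCluster(2, type = "PSOCK")
pids <- unlist(clusterCall(cl, Sys.getpid))
stopCluster(cl)

# Against a live cluster you could then signal a single worker, e.g.:
# tools::pskill(pids[1], tools::SIGTERM)
```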
This is a bit of a puzzle. The only reasons gc() wouldn't be freeing memory are that it isn't running at all, or that there is a lingering reference to the object somewhere.
Allocate a static amount of memory for testing
Using a vector to assign chunks of memory might be more effective for testing. At least it makes it easier to work out how much memory to expect each allocation to use.
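For example (a sketch, assuming double vectors at 8 bytes per element): allocating numeric vectors of a known length gives a predictable footprint, unlike growing a list of character vectors.

```r
# A numeric (double) vector costs 8 bytes per element plus a small header,
# so numeric(1e6) is roughly 8 MB and numeric(125e6) roughly 1 GB.
chunk <- numeric(1e6)
bytes <- as.numeric(object.size(chunk))
```

That makes it easy to step memory use up in fixed increments while watching htop.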