Snowfall and TensorFlow

I am trying to distribute my MCS simulation across the university computer cluster, using the R tensorflow package together with snowfall.

When I distribute across the cores of a single node with the following code:

sfInit(parallel = TRUE, cpus = 15)

everything works fine.

But if I try to distribute across multiple nodes with the following code:

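# Read the host names that PBS allocated to this job
pbsnodefile <- Sys.getenv("PBS_NODEFILE")
machines <- scan(pbsnodefile, what = "")
nmach <- length(machines)

# Start one socket worker per entry in the node file
sfInit(parallel = TRUE, type = "SOCK", cpus = nmach, socketHosts = machines)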

snowfall fails to load the keras package on the workers (the same happens with tensorflow), although every other "normal" package I have tried loads fine. With keras I receive the following error:

Error in sfLibrary(keras) :
Stop: error loading library on slave(s): keras
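
The message above does not say why the load failed on the workers. One way to narrow it down, as a rough sketch only, is to attempt the load on every worker and return the underlying error text together with the host name and library paths. `try_load` is a hypothetical helper written for this post, not part of snowfall, and the assumption that the nodes differ in their library paths is only a guess to be checked.

```r
# Hypothetical helper (not part of snowfall): tries to attach a package and
# returns the host name, library paths, and either "ok" or the error message.
try_load <- function(pkg) {
  status <- tryCatch({
    # library() signals an error if the package is missing or fails to load
    library(pkg, character.only = TRUE)
    "ok"
  }, error = function(e) conditionMessage(e))

  list(host     = Sys.info()[["nodename"]],
       libpaths = .libPaths(),
       status   = status)
}

# Usage on a running snowfall cluster (after sfInit):
#   str(sfClusterCall(try_load, "keras"))
```

`sfClusterCall` runs the function once on each worker, so the result is a list with one entry per worker, which makes it easy to see whether only the remote nodes fail and with which message.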

Thank you for your help!!

Best

Tullio
