Can't use R keras fit_generator() with a custom data generator

This is my first post here, so any advice on how to describe the problem better is welcome too.

That being said, let's head into the problem I've been facing for a couple of hours:

To train a VGG16 model, I'm using a custom R data generator to preprocess the data that comes from keras:flow_from_directory. Although my laptop doesn't have nearly enough CPU power, I got it working there with reduced batch_size, epochs, and steps_per_epoch. A simpler example that reproduces the working setup can be found here:
https://stackoverflow.com/questions/53357901/using-a-custom-r-generator-function-with-fit-generator-keras-.
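
For context, the wrapping pattern I mean looks roughly like the sketch below. The names wrap_generator, inner_gen, and preprocess are made up for illustration, and inner_gen is a plain R stub rather than the real image iterator, so this runs without keras. With a real keras iterator I would expect the batch to be pulled with generator_next() instead of calling a function directly.

```r
# Hypothetical stand-in for the directory iterator: returns list(x, y) batches.
inner_gen <- function() {
  list(
    array(runif(4 * 3, min = 0, max = 255), dim = c(4, 3)),  # fake "pixel" values
    array(sample(0:1, 4, replace = TRUE), dim = c(4, 1))     # fake labels
  )
}

# Hypothetical preprocessing step: rescale inputs to [0, 1].
preprocess <- function(x) x / 255

# Wrap a batch source in a closure that preprocesses the inputs of each batch
# and passes the targets through unchanged.
wrap_generator <- function(next_batch, preprocess) {
  function() {
    batch <- next_batch()
    list(preprocess(batch[[1]]), batch[[2]])
  }
}

wrapped <- wrap_generator(inner_gen, preprocess)
batch <- wrapped()
str(batch)
```

The wrapped function has the same shape as the example generator further down: each call returns a list(x, y) batch.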

But the problem starts once I move to my GPU-equipped computer and call fit_generator with this custom R generator. Training gets stuck at the first step of the first epoch, with no further response from the R console. This happens with both my own model and the example model linked above. Here is what I get:

> library(keras)
> # example data
> data <- data.frame(
+   x = runif(80),
+   y = runif(80),
+   z = runif(80)
+ )
> # example generator
> data_generator <- function(data, x, y, batch_size) {
+   
+   # start iterator
+   i <- 1
+   
+   # return an iterator function
+   function() {
+
+     # reset iterator if already seen all data
+     if ((i + batch_size - 1) > nrow(data)) i <<- 1
+ 
+     # iterate current batch's rows
+     rows <- c(i:min(i + batch_size - 1, nrow(data)))
+     
+     # update to next iteration
+     i <<- i + batch_size
+     
+     # create container arrays
+     x_array <- array(0, dim = c(length(rows), length(x)))
+     y_array <- array(0, dim = c(length(rows), length(y)))
+     
+     # fill the container
+     x_array[1:length(rows), ] <- data[rows, x]
+     y_array[1:length(rows), ] <- data[rows, y]
+     
+     # return the batch
+     list(x_array, y_array)
+     
+   }
+   
+ }
> # set-up a generator
> gen <- data_generator(
+   data = data.matrix(data),
+   x = 1:2, # it is flexible, you can use the column numbers,
+   y = c("y", "z"), # or the column name
+   batch_size = 32
+ )
> # set up a simple keras model
> model <- keras_model_sequential() %>% 
+   layer_dense(32, input_shape = c(2)) %>% 
+   layer_dense(2)
2020-07-16 19:37:32.040393: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-16 19:37:35.098731: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-07-16 19:37:35.116724: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 970 computeCapability: 5.2
coreClock: 1.266GHz coreCount: 13 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 208.91GiB/s
2020-07-16 19:37:35.117090: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-16 19:37:35.124119: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-16 19:37:35.129961: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-16 19:37:35.132197: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-16 19:37:35.137769: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-16 19:37:35.141461: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-16 19:37:35.153316: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-16 19:37:35.153789: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-16 19:37:35.154322: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-07-16 19:37:35.164029: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x24a290c5750 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-16 19:37:35.164373: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-07-16 19:37:35.165056: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 970 computeCapability: 5.2
coreClock: 1.266GHz coreCount: 13 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 208.91GiB/s
2020-07-16 19:37:35.165386: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-16 19:37:35.165896: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-16 19:37:35.166327: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-16 19:37:35.166681: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-16 19:37:35.166922: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-16 19:37:35.167133: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-16 19:37:35.167332: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-16 19:37:35.167569: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-16 19:37:35.680951: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-16 19:37:35.681298: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0 
2020-07-16 19:37:35.681438: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N 
2020-07-16 19:37:35.681704: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2991 MB memory) -> physical GPU (device: 0, name: GeForce GTX 970, pci bus id: 0000:01:00.0, compute capability: 5.2)
2020-07-16 19:37:35.684551: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x24a4bef6320 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-07-16 19:37:35.684840: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX 970, Compute Capability 5.2
> model %>% compile(
+   optimizer = "rmsprop",
+   loss = "mse"
+ )
> # fit using generator
> model %>% fit_generator(
+   generator = gen,
+   steps_per_epoch = 100, # will auto-reset after see all sample
+   epochs = 10,
+   max_queue_size = 50
+   
+ )
2020-07-16 19:37:48.296325: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
Epoch 1/10
  1/100 [..............................] - ETA: 0s - loss: 0.4254

As I said, this same example code runs smoothly on my laptop with a CPU installation of Keras.
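
One way to rule out the generator itself, as a minimal sketch, is to call it by hand outside of fit_generator and check that every call returns promptly with the expected shapes. The code below repeats the example generator from above without the console prompts and does not need keras. It only tests the R side, so it cannot show what happens once TensorFlow starts pulling batches.

```r
# Same example data and generator as above, without the console prompts.
data <- data.frame(x = runif(80), y = runif(80), z = runif(80))

data_generator <- function(data, x, y, batch_size) {
  i <- 1
  function() {
    # reset the position if a full batch no longer fits
    if ((i + batch_size - 1) > nrow(data)) i <<- 1
    rows <- i:min(i + batch_size - 1, nrow(data))
    i <<- i + batch_size
    x_array <- array(0, dim = c(length(rows), length(x)))
    y_array <- array(0, dim = c(length(rows), length(y)))
    x_array[1:length(rows), ] <- data[rows, x]
    y_array[1:length(rows), ] <- data[rows, y]
    list(x_array, y_array)
  }
}

gen <- data_generator(data.matrix(data), x = 1:2, y = c("y", "z"), batch_size = 32)

# Pull a few batches by hand, printing the shapes and the time each call took.
for (step in 1:5) {
  elapsed <- system.time(batch <- gen())[["elapsed"]]
  cat(sprintf("step %d: x %s, y %s, %.3fs\n", step,
              paste(dim(batch[[1]]), collapse = "x"),
              paste(dim(batch[[2]]), collapse = "x"),
              elapsed))
}
```

If every step prints right away, the generator itself is not what blocks. As a side note, with 80 rows and batch_size = 32 the reset fires before the last 16 rows are reached, so every batch has exactly 32 rows.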

Has anyone faced a similar problem, or does anyone know what could be causing this? I'm glad to provide more information if needed.

Thanks in advance!

Additional info: on the GPU installation of TensorFlow I can train models normally when I use a standard generator, such as the one I get directly from flow_from_directory.
