Keras prediction differs numerically for last n%%4 instances

Is this the right place for tensorflow bugs?

Predictions may differ slightly depending on row position and input length. More specifically, if n = NROW(x), row i gets one value when 0 < i <= bitwAnd(n, -4), but a possibly different value when bitwAnd(n, -4) < i <= n. (These are Fortran-style 1-based indices; for C++ / Python 0-based indexing, swap "<=" and "<".) I write "may" because it happens for me with probability around 0.4, and "slightly" means differences on the order of the least significant bits of the float32 mantissa.
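
For reference, bitwAnd(n, -4) clears the two lowest bits of n, which rounds n down to the nearest multiple of 4, so the affected tail is exactly the last n %% 4 rows:

sapply(8:12, function(n) bitwAnd(n, -4))
#> [1]  8  8  8  8 12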

Here's a reproducible example:

options(digits=8)
fake <- function(shape_) {                        # arbitrary but reproducible
   array(seq_len(prod(shape_)) %% 2.71 - 1.04, shape_)
}

library(keras)
shape <- c(30,5)
model <- keras_model_sequential() %>%
   layer_lstm(units=2, input_shape=shape) %>%
   set_weights(list(fake(c(5, 8)), fake(c(2, 8)), fake(8)))

n <- 11                                           # not a multiple of 4
x <- array(rep(fake(shape), each=n), c(n, shape)) # n copies of identical input
p <- model %>% predict(x)                         # all predictions should match
p                                                 # but last n%%4 rows differ
#>             [,1]        [,2]
#>  [1,] 0.46561426 -0.22865930
#>  [2,] 0.46561426 -0.22865930
#>  [3,] 0.46561426 -0.22865930
#>  [4,] 0.46561426 -0.22865930
#>  [5,] 0.46561426 -0.22865930
#>  [6,] 0.46561426 -0.22865930
#>  [7,] 0.46561426 -0.22865930
#>  [8,] 0.46561426 -0.22865930
#>  [9,] 0.46561423 -0.22865926
#> [10,] 0.46561423 -0.22865926
#> [11,] 0.46561423 -0.22865926
(t(p)-p[1,]) * 2**26                              # the difference is low bits
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
#> [1,]    0    0    0    0    0    0    0    0   -2    -2    -2
#> [2,]    0    0    0    0    0    0    0    0    3     3     3

Created on 2019-07-22 by the reprex package (v0.3.0)

One odd consequence is that you may get one result in row 25 when there are 28 rows total but a different result when there are 27 (so that row 25 falls in the last 27 %% 4 = 3 rows), without otherwise changing the input data.
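
To make that concrete, here is a sketch reusing model, fake, and shape from the reprex above (I have not rerun this, so no output is shown):

x28 <- array(rep(fake(shape), each=28), c(28, shape)) # 28 identical rows
p28 <- model %>% predict(x28)                         # bitwAnd(28, -4) = 28: no tail
p27 <- model %>% predict(x28[1:27, , , drop=FALSE])   # bitwAnd(27, -4) = 24: rows 25-27 are the tail
p28[25, ] - p27[25, ]                                 # may be nonzero despite identical inputs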

Some more background regarding our application to time series:

If predicting n days, the first n-1 predictions should match what you get with one less day of data, generating just n-1 predictions. This property is useful in testing: checking it catches bugs where a back-test accidentally snoops into future data (a sketch of such a check follows below).
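
A minimal sketch of such a check, again reusing model, fake, and shape from the reprex (x12 is a hypothetical 12-row input; identical() demands exact equality, which is the point):

x12 <- array(rep(fake(shape), each=12), c(12, shape)) # 12 rows: a multiple of 4
p12 <- model %>% predict(x12)
p11 <- model %>% predict(x12[1:11, , , drop=FALSE])   # drop the last day
stopifnot(identical(p12[1:11, ], p11))                # fails when the bug manifests: rows 9-11 differ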

keras (or TensorFlow) has a bug causing the final 1, 2 or 3 predictions to change as you stack on additional data, even though no change is made to the historical data! The change to the predictions is small in floating point, but it should be exactly zero, and any numerical difference, however small, will interfere with catching future-snooping bugs (which can also have small, disguised impact).

Above, I provided a small, self-contained example illustrating this issue. In it, every day should get the same prediction, but the final few days vary.

A workaround for this: pad your input data so its length is a multiple of four (adding 0/1/2/3 rows), generate predictions, then remove the final 0/1/2/3 rows corresponding to the padding. Now your predictions are stable and do not change as you add more days.
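
A minimal sketch of that workaround, assuming the model and x from the reprex (pad_to_4 is a hypothetical helper, and abind::abind from the abind package is one way to stack along the first dimension):

pad_to_4 <- function(x) {                             # pad rows up to a multiple of 4
   pad <- (4 - NROW(x) %% 4) %% 4                     # 0, 1, 2 or 3 filler rows needed
   if (pad == 0) return(x)
   abind::abind(x, x[rep(1, pad), , , drop=FALSE], along=1)
}
xp <- pad_to_4(x)
pp <- model %>% predict(xp)
pp[seq_len(NROW(x)), , drop=FALSE]                    # discard predictions for the padding

Because the padded length is always a multiple of 4, no row ever lands in the misaligned tail, so earlier predictions stay fixed as more days are appended.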

BTW, a week ago at github.com/tensorflow/tensorflow/issues/30995, qlzh727 said, "Let me take a look".

Since this bug is directly related to the tensorflow package and not to the R wrapper, I'm afraid there is not much we can do for you; the best place for this would be the GitHub issue you already opened.

Great, thank you for the reply. I started with the R package "keras" but began suspecting the trouble lay deeper.
