Changing forecasts in LSTM network

Hello. I have created an LSTM neural network for time series forecasting. The corresponding code is shown below:

library(keras)
library(tensorflow)
library(stats)
library(ggplot2)
library(readr)
library(dplyr)
library(forecast)
library(Metrics)
library(plotly)


# Plot the log-return series
ggplot(data.frame(SP500logreturns), aes(x = 1:length(SP500logreturns), y = SP500logreturns)) +
  geom_line()


# Build a one-step lagged copy of the series to use as the predictor
lagged <- as.data.frame(cbind(lag(SP500logreturns, 1), SP500logreturns))
lagged[is.na(lagged)] <- 0
colnames(lagged) <- c("x-1", "x")
View(lagged)

#Splitting the data into training and testing datasets
N<-nrow(lagged)
n<-round(N*0.7,digits=0)
lagged_train<-lagged[1:n,]
lagged_test<-lagged[(n+1):N,]


#Scaling the data
scale_data = function(lagged_train, lagged_test, feature_range = c(0, 1)) {
  x = lagged_train
  fr_min = feature_range[1]
  fr_max = feature_range[2]
  # min-max scale both sets using the training set's min and max
  std_train = (lagged_train - min(x)) / (max(x) - min(x))
  std_test  = (lagged_test - min(x)) / (max(x) - min(x))
  
  scaled_train = std_train * (fr_max - fr_min) + fr_min
  scaled_test  = std_test * (fr_max - fr_min) + fr_min
  
  return(list(scaled_train = scaled_train,
              scaled_test  = scaled_test,
              scaler = c(min = min(x), max = max(x))))
}


Scaled = scale_data(lagged_train, lagged_test, c(-1, 1))

y_train = Scaled$scaled_train[, 2]
x_train = Scaled$scaled_train[, 1]

y_test = Scaled$scaled_test[, 2]
x_test = Scaled$scaled_test[, 1]


## inverse-transform
invert_scaling = function(scaled, scaler, feature_range = c(0, 1)){
  min = scaler[1]
  max = scaler[2]
  t = length(scaled)
  mins = feature_range[1]
  maxs = feature_range[2]
  inverted_dfs = numeric(t)
  
  for( i in 1:t){
    X = (scaled[i]- mins)/(maxs - mins)
    rawValues = X *(max - min) + min
    inverted_dfs[i] <- rawValues
  }
  return(inverted_dfs)
}

#Modelling the data
# Reshape the input to 3-dim
dim(x_train) <- c(length(x_train), 1, 1)

# specify required arguments
X_shape2 = dim(x_train)[2]
X_shape3 = dim(x_train)[3]
batch_size = 1                # must be a common factor of both the train and test samples
units = 1                     # can adjust this in the model tuning phase

#=========================================================================================

model <- keras_model_sequential() 
model%>%
  layer_lstm(units, batch_input_shape = c(batch_size, X_shape2, X_shape3), stateful= TRUE)%>%
  layer_dense(units = 1)


#Compiling the model
model %>% compile(
  loss = 'mean_squared_error',
  optimizer = optimizer_adam( lr= 0.0093, decay = 0.0055 ),  
  metrics = c('accuracy')
)

#Summary
summary(model)


#Fitting the model to data
Epochs = 50   
for(i in 1:Epochs ){
  model %>% fit(x_train, y_train, epochs=1, batch_size=batch_size, verbose=1, shuffle=FALSE)
  model %>% reset_states()
}


#Forecasting the future values
L = length(x_test)
scaler = Scaled$scaler
predictions = numeric(L)

for(i in 1:L){
  X = x_test[i]
  dim(X) = c(1,1,1)
  yhat = model %>% predict(X, batch_size=batch_size)
  # invert scaling
  yhat = invert_scaling(yhat, scaler,  c(-1, 1))
  # invert differencing
  yhat  = yhat + SP500logreturns[(n+i)]
  # store
  predictions[i] <- yhat
  
}

However, every time I run this code, the forecast stored in predictions is different.

Could you please tell me how to make the predicted values identical on every run of the code?

Thank you for your effort.

The way you are presenting this implies that you never rerun only the model prediction part, but always the model fitting as well. To stop the unpredictable randomisation from one model fit to the next, set a random seed before the model fitting step.

set.seed(42)
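
For the keras/tensorflow R packages specifically, TensorFlow keeps its own random state in addition to R's, so depending on your package versions set.seed() on its own may not be enough; you may also need to seed TensorFlow itself, for example with tensorflow::set_random_seed(). A minimal sketch (assuming a tensorflow R package version that exports set_random_seed()):

library(keras)
library(tensorflow)

set.seed(42)                      # seeds R's random number generator
tensorflow::set_random_seed(42)   # also seeds TensorFlow's generator

model <- keras_model_sequential()
# ... build, compile and fit the model exactly as in the code above ...

Run the seeding lines before the model is built and fitted, so that weight initialisation and training see the same random numbers every time.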

What does the argument of set.seed() depend on? In other words, why is it 42?

It can be any integer.
I like 42.


set.seed() just forces the randomness to be the same from one run to the next. There is no particular value the seed should be, but it helps to know which number you used so you can reproduce your results in the future.
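
For example, in plain base R (nothing keras-specific), resetting the seed before a random draw makes the draw repeat exactly:

set.seed(42); rnorm(3)   # three "random" numbers
set.seed(42); rnorm(3)   # exactly the same three numbers, because the seed was reset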
