forecast() function taking too long to give results for a hierarchical time series project


I am working on a hierarchical time series forecasting project with a data set of 3,064,489 rows and 7 columns. From this data set, I created a hierarchical tsibble object with the following code:

data |>
  aggregate_key(Cluster / County, Cases = sum(count))
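For readers unfamiliar with `aggregate_key()`, here is a minimal sketch on made-up data (the `toy` data frame and its values are hypothetical, not the poster's data). It shows that the nested specification `Cluster / County` adds one series per cluster and one grand total on top of the county-level series. Every one of those series is then fitted separately by `model()`, so 3,118 keys with 7 models means 3,118 x 7 = 21,826 model fits.

```r
library(tsibble)
library(fabletools)

# Hypothetical toy data: 3 counties in 2 clusters, 5 daily observations each
toy <- data.frame(
  Date   = rep(as.Date("2024-01-01") + 0:4, times = 3),
  County = rep(1:3, each = 5),
  count  = 1L
)
toy$Cluster <- ifelse(toy$County <= 2, 1L, 2L)

toy_ts <- as_tsibble(toy, index = Date, key = c(Cluster, County))

# Same call shape as in the question: counties nested within clusters
toy_hts <- aggregate_key(toy_ts, Cluster / County, Cases = sum(count))

# Number of series after aggregation:
# 3 counties + 2 cluster totals + 1 grand total
n_keys(toy_hts)
```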

The resulting hierarchical time series has the following dimensions:
# A tsibble: 3,074,348 x 4 [1D]
# Key:       Cluster, County [3,118]

  Date Cluster County Cases
       <int*>  <int*>

Using the model() function, I fitted the models below (fitting them took almost 24 hours). The resulting mable has the following dimensions:
# A mable: 3,118 x 9
# Key:     Cluster, County [3,118]

  Cluster County Mean Naive SNaive Drift Ets Arima Neural
  <int*>  <int*>

But when I run the forecast() function on all of the above models at once, it is still running after 3 days without producing any result.
I am not sure whether the time taken by forecast() is due to the size of the data.

Please help me mitigate the issue.


Referred here by Forecasting: Principles and Practice, by Rob J Hyndman and George Athanasopoulos

This is very likely the case. See section 12.7 of the text:

Most time series models do not work well for very long time series. The problem is that real data do not come from the models we use. When the number of observations is not large (say up to about 200) the models often work well as an approximation to whatever process generated the data. But eventually we will have enough data that the difference between the true process and the model starts to become more obvious. An additional problem is that the optimisation of the parameters becomes more time consuming because of the number of observations involved.
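For scale, 3,074,348 rows over 3,118 keys works out to 986 daily observations per series, well past the "about 200" mentioned in the quote. One possible mitigation, sketched here on simulated data, is to fit on only the most recent part of each series and to time the pipeline on a couple of series before scaling up. The `sim` data, the 200-day window, and the fixed ETS specification are all assumptions chosen for illustration, not recommendations for the real data.

```r
library(tsibble)
library(fable)
library(dplyr)

set.seed(1)

# Hypothetical stand-in for the real data: 2 daily series, 986 observations each
sim <- tsibble(
  Date   = rep(as.Date("2021-01-01") + 0:985, times = 2),
  County = rep(1:2, each = 986),
  Cases  = rpois(2 * 986, lambda = 20),
  index  = Date,
  key    = County
)

# Keep the most recent 200 days (arbitrary window). Because both series end
# on the same date, this leaves 200 rows per series.
recent <- sim |>
  filter(Date > max(Date) - 200)

# Time the fit and the forecast separately to see where the cost lies
fit_time <- system.time(
  fit <- recent |>
    model(Ets = ETS(Cases ~ error("A") + trend("N") + season("N")))
)
fc_time <- system.time(
  fc <- forecast(fit, h = 14)
)

fit_time
fc_time
```

Timing a handful of series like this gives a rough per-series cost that can be multiplied up to 3,118 keys before committing to a multi-day run.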

Hi @technocrat,

Thanks for clarifying.
In the end, I ran each model separately. All of the models generated forecasts except the neural network model.
The neural network forecast kept running for more than 17 hours without producing any result.
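A possible explanation, offered as a sketch rather than a confirmed diagnosis: in fable, NNETAR forecast distributions are built by simulating many future sample paths, which is far more expensive than the closed-form forecasts of the other models. Assuming a recent fable version in which `forecast()` for NNETAR models accepts the `times` and `simulate` arguments, the number of simulated paths can be reduced, or simulation can be skipped to get point forecasts only. The `toy` series below is made up for illustration, and the nnet package must be installed.

```r
library(tsibble)
library(fable)

set.seed(1)

# Hypothetical single daily series with a weekly pattern
toy <- tsibble(
  Date  = as.Date("2024-01-01") + 0:99,
  Cases = 50 + 10 * sin(2 * pi * (0:99) / 7) + rnorm(100),
  index = Date
)

# n_networks = 5 (instead of the default 20) just keeps this example quick
fit_nn <- toy |>
  model(Neural = NNETAR(Cases, n_networks = 5))

# Option 1: simulate far fewer sample paths than the default
fc_fast <- forecast(fit_nn, h = 14, times = 100)

# Option 2: skip simulation; gives forecast means without usable intervals
fc_mean <- forecast(fit_nn, h = 14, simulate = FALSE)
```

Fewer paths give noisier prediction intervals, so this trades interval quality for speed; with `simulate = FALSE` there is no uncertainty estimate at all.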