ARIMA flat forecasts

I have data as follows:

Date    Paid
Jan-14  13392905
Feb-14  11939873
Mar-14  12473667
Apr-14  12237110
May-14  12579693
Jun-14  12030095
Jul-14  12052101
Aug-14  10205025
Sep-14  12102526
Oct-14  1237336
Nov-14  12148331
Dec-14  9842860
Jan-15  11990085
Feb-15  11061740
Mar-15  12076397
Apr-15  11702514
May-15  11395657
Jun-15  11817594
Jul-15  11643682
Aug-15  10243241
Sep-15  12233001
Oct-15  11769231
Nov-15  12652418
Dec-15  9774333
Jan-16  11888965
Feb-16  11892589
Mar-16  11419517
Apr-16  12143787
May-16  12330387
Jun-16  11929805
Jul-16  11583281
Aug-16  11995557
Sep-16  12646047
Oct-16  12677372
Nov-16  13301244
Dec-16  9915846

Using the 2014-2015 data, I want to generate forecasts until 2020. Hence, I have split the data into train and test sets:

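library(forecast)  # provides auto.arima()
# mydata_ts is assumed to be a monthly ts object holding the Paid series above
data.train <- window(mydata_ts, start = c(2014, 1), end = c(2015, 12))
data.test <- window(mydata_ts, start = c(2016, 1))
auto.arima(data.train, trace = TRUE, test = "kpss", ic = "aic")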

The following are the results:

  Best model: ARIMA(0,0,0)            with non-zero mean 

  Series: data.train 
  ARIMA(0,0,0) with non-zero mean 

  Coefficients:
              mean
        11275058.9
  s.e.    463612.8

  sigma^2 estimated as 5.381e+12:  log likelihood=-385.31
  AIC=774.62   AICc=775.19   BIC=776.98

I get flat forecasts. I have tried using drift, but that only helps when forecasting for 2016; the forecasts flatten from 2017 onward. Is there something that can be done to overcome this? I have also tried a similar exercise in SAS using PROC UCM, and that seems to generate better forecasts than auto.arima.

Can someone help out?

Thanks in advance.

Could you please turn this into a self-contained reprex (short for minimal reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.

According to Rob Hyndman (the author of the forecast package and book), you are getting a flat forecast because what you see is a point forecast, which is essentially the mean of all possible future sample paths. There are a few likely reasons for this outcome, which he lists in his blog post here. Here is what he says in that blog post:

  • A random walk model will return a flat forecast function (equal to the last observed value of the series).

  • An ETS(A,N,N) model will return a flat forecast function.

  • An iid model will return a flat forecast function (equal to the mean of the observed data).
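
Two of these cases can be checked with a quick sketch in base R (stats package only). The simulated series and the object names (x, rw_fit, ses_fit) are made up for illustration, and HoltWinters() with beta = FALSE and gamma = FALSE is used here as a stand-in for simple exponential smoothing, whose point forecasts correspond to those of ETS(A,N,N); it is not the forecast package's ets() function.

```r
# Base R only; the simulated series is purely illustrative.
set.seed(1)
x <- ts(cumsum(rnorm(60)), start = c(2014, 1), frequency = 12)

# Random walk: ARIMA(0,1,0) without drift has no parameters to estimate.
rw_fit <- arima(x, order = c(0, 1, 0))
rw_fc <- predict(rw_fit, n.ahead = 6)$pred
print(rw_fc)        # every value equals the last observation
print(tail(x, 1))

# Simple exponential smoothing: no trend, no seasonality.
ses_fit <- HoltWinters(x, beta = FALSE, gamma = FALSE)
ses_fc <- predict(ses_fit, n.ahead = 6)
print(ses_fc)       # every value equals the final smoothed level
```

In both cases the point forecast is a horizontal line; only the prediction intervals change with the horizon.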

Beyond this blog post, the Forecasting section in the ARIMA chapter of his book gives a fuller explanation of how forecasts are generated.

In addition, your model is telling you something about your data. An ARIMA model of the form ARIMA(0,0,0) has no autoregressive terms, no moving average terms, and no differencing, i.e., your data is treated as white noise around a constant level. The model consists only of an intercept (the mean in this case) and an error term, whose variance is reported as sigma^2. The s.e. shown under the coefficients is the standard error of the estimated mean, not the error term itself.
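
This can be reproduced as a small self-contained sketch using the 2014-2015 values from the question. It assumes that fitting ARIMA(0,0,0) with a mean directly through stats::arima() is a fair stand-in for the model auto.arima() selected; the object names (paid, fit, fc) are made up for illustration.

```r
# Training data copied from the question (Jan 2014 to Dec 2015).
paid <- c(13392905, 11939873, 12473667, 12237110, 12579693, 12030095,
          12052101, 10205025, 12102526,  1237336, 12148331,  9842860,
          11990085, 11061740, 12076397, 11702514, 11395657, 11817594,
          11643682, 10243241, 12233001, 11769231, 12652418,  9774333)
data.train <- ts(paid, start = c(2014, 1), frequency = 12)

# ARIMA(0,0,0) with a non-zero mean: an intercept plus white noise.
fit <- arima(data.train, order = c(0, 0, 0), include.mean = TRUE)
fc <- predict(fit, n.ahead = 12)

print(coef(fit))         # intercept, close to mean(data.train)
print(mean(data.train))
print(fc$pred)           # the same value repeated for every month
print(fc$se)             # also constant across the horizon
```

Because the fitted model has nothing but a mean, every future month is forecast at that mean. A different shape can only come from a model with trend, seasonal, or autoregressive structure.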
