Getting NULL Models for ARIMA when forecasting with fable and tsibble

I am getting NULL models when I use ARIMA with automatic selection, but if I fix a parameter such as d = 1 it runs successfully and returns a model. I tried an ETS model and it also worked as expected. I am running this on Databricks DBR 8.0 with the following R version -
platform x86_64-pc-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 4
minor 0.3
year 2020
month 10
day 10
svn rev 79318
language R
version.string R version 4.0.3 (2020-10-10)
nickname Bunny-Wunnies Freak Out

However, if I run the same code on a Windows 10 system, it produces the expected results.

The code is as follows:

library(tidyverse)
library(data.table)
library(lubridate)
library(urca)
library(fable)
library(tsibble)

df_model4_impute <- 
  df_model4 %>%
  as_tsibble(index= cal_dt, key = material) %>%
  group_by_key()

df_model4_impute <- 
  df_model4_impute %>%
  fill_gaps() %>%
  fill(observed_cleaned, .direction = "down")

fit <- 
    df_model4_impute %>%
    filter(ZDEPMENT == "F22") %>%
    model(
      # ARIMA = ARIMA(observed_cleaned ~ pdq(d=1) + PDQ(D=1))
      ARIMA_def = ARIMA(observed_cleaned)
      # ETS = ETS(observed_cleaned)
      )

Results -
material              ARIMA_def
   <chr>                   <model>
 1 000000000020000005  <NULL model>
 2 000000000020001325  <NULL model>

But when I run it on a Windows 10 system, I get the expected results.

Can someone please help?

Please provide a reproducible example of the problem. That is, we need access to your data to see what is going on.

I have created some dummy code here to reproduce the case -

# CREATE DATA SET FOR REPRODUCING THE SCENARIO

databrics_dummy <- data.frame(cal_dt = seq(as.Date("2020-01-01"), as.Date("2021-03-31"), by = "day"),
                                 department = rep("F22", 456),
                                 observed_cleaned = seq(20, 60, length = 456))

# creating group key & index
databrics_dummy_ts <- 
  databrics_dummy %>%
  as_tsibble(index= cal_dt, key = department) %>%
  group_by_key()

# impute missing values 
databrics_dummy_ts <- 
  databrics_dummy_ts %>%
  fill_gaps() %>%
  fill(observed_cleaned, .direction = "down")

# run model

fit_db <-
  databrics_dummy_ts %>%
  model(
  arima = ARIMA(observed_cleaned),
  arima_def = ARIMA(observed_cleaned ~ pdq(d=1) + PDQ(D=1)),
  ets = ETS(observed_cleaned)
  )

# Results 

# A mable: 1 x 4
# Key:     department [1]
  department        arima                arima_def          ets
  <chr>           <model>                  <model>      <model>
1 F22        <NULL model> <ARIMA(1,1,1)(0,1,1)[7]> <ETS(M,A,M)>

It produces a NULL model for the automatic ARIMA. I am using the Databricks platform (running in the 8.0 runtime environment).

Environment detail -

platform x86_64-pc-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 4
minor 0.3
year 2020
month 10
day 10
svn rev 79318
language R
version.string R version 4.0.3 (2020-10-10)
nickname Bunny-Wunnies Freak Out

@robjhyndman can you please help me here? Or should I switch to forecast package?

I am unable to reproduce this issue:

library(fpp3)
#> ── Attaching packages ──────────────────────────────────────────── fpp3 0.4.0 ──
#> ✓ tibble      3.1.2          ✓ tsibble     1.0.1     
#> ✓ dplyr       1.0.6          ✓ tsibbledata 0.3.0     
#> ✓ tidyr       1.1.3          ✓ feasts      0.2.1.9000
#> ✓ lubridate   1.7.10         ✓ fable       0.3.1     
#> ✓ ggplot2     3.3.3.9000
#> ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
#> x lubridate::date()    masks base::date()
#> x dplyr::filter()      masks stats::filter()
#> x tsibble::intersect() masks base::intersect()
#> x tsibble::interval()  masks lubridate::interval()
#> x dplyr::lag()         masks stats::lag()
#> x tsibble::setdiff()   masks base::setdiff()
#> x tsibble::union()     masks base::union()

databrics_dummy <- data.frame(cal_dt = seq(as.Date("2020-01-01"), as.Date("2021-03-31"), by = "day"),
                              department = rep("F22", 456),
                              observed_cleaned = seq(20, 60, length = 456))

# creating group key & index
databrics_dummy_ts <- 
  databrics_dummy %>%
  as_tsibble(index= cal_dt, key = department) %>%
  group_by_key()

# impute missing values 
databrics_dummy_ts <- 
  databrics_dummy_ts %>%
  fill_gaps() %>%
  fill(observed_cleaned, .direction = "down")

# run model

fit_db <-
  databrics_dummy_ts %>%
  model(
    arima = ARIMA(observed_cleaned),
    arima_def = ARIMA(observed_cleaned ~ pdq(d=1) + PDQ(D=1)),
    ets = ETS(observed_cleaned)
  )
#> Warning in sqrt(diag(best$var.coef)): NaNs produced
fit_db
#> # A mable: 1 x 4
#> # Key:     department [1]
#>   department          arima                arima_def          ets
#>   <chr>             <model>                  <model>      <model>
#> 1 F22        <ARIMA(0,1,4)> <ARIMA(1,1,1)(0,1,1)[7]> <ETS(M,A,M)>

Created on 2021-07-15 by the reprex package (v2.0.0)

I am very surprised. This time I also get a result, and the only change I made was attaching the fpp3 library instead of fable. However, I tried some more models and found that they come back as NULL. The code is here:


fit_db <-
  databrics_dummy_ts %>%
  model(
  arima_auto = ARIMA(observed_cleaned),
  arima_def = ARIMA(observed_cleaned ~ pdq(d=1) + PDQ(D=1)),
  arima_custom_1 = ARIMA(observed_cleaned ~ pdq(4,1,3) + PDQ(0,1,0)), # NULL Model
  arima_custom_2 = ARIMA(observed_cleaned ~ pdq(4,1,4) + PDQ(2,0,0)), # NULL Model
  arima_custom_3 = ARIMA(observed_cleaned ~ pdq(0,1,3) + PDQ(0,1,1)), # NULL Model
  ets = ETS(observed_cleaned)
  )

@mitchelloharawild can you please help me here? custom_1, custom_2 and custom_3 produce NULL models. But when I add a constant to these models, they are estimated. I am also using the same (Linux-based) environment as mentioned above.
In case you have trouble reproducing this, can you please guide me on how to create reproducible code?

The models you are attempting to estimate are too complicated for the automatic selection algorithm's default constraint of order_constraint = p + q + P + Q <= 6 & (constant + d + D <= 2). That is why you are getting the error "There are no ARIMA models to choose from after imposing the 'order_constraint', please consider allowing more models." Because you have not said whether a constant should be included, ARIMA() still has to choose between candidate models, so the constraint applies. This is indicated in the accompanying warning message:

It looks like you're trying to fully specify your ARIMA model but have not said if a constant should be included.
You can include a constant using ARIMA(y~1) to the formula or exclude it by adding ARIMA(y~0).
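
As a minimal sketch of the two ways around this, the code below either states the constant explicitly (here `~ 0`, i.e. no constant) or keeps the formula as it was and loosens `order_constraint`. The noisy series, the object names `noisy` and `fit`, and the relaxed limit of 7 are assumptions made for illustration; whether a given order then estimates cleanly still depends on the data.

```r
library(fable)
library(tsibble)
library(dplyr)

# Hypothetical data: the thread's linear trend plus noise (the noise is an
# assumption, added so that there is something left to estimate).
set.seed(42)
n <- 456
noisy <- tsibble(
  cal_dt = seq(as.Date("2020-01-01"), by = "day", length.out = n),
  observed_cleaned = seq(20, 60, length.out = n) + rnorm(n),
  index = cal_dt
)

fit <- noisy %>%
  model(
    # Option 1: say explicitly that no constant is wanted (use `~ 1 +` to include one)
    no_constant = ARIMA(observed_cleaned ~ 0 + pdq(4, 1, 3) + PDQ(0, 1, 0)),
    # Option 2: leave the constant open, but allow p + q + P + Q up to 7
    relaxed = ARIMA(
      observed_cleaned ~ pdq(4, 1, 3) + PDQ(0, 1, 0),
      order_constraint = p + q + P + Q <= 7 & (constant + d + D <= 2)
    )
  )

fit
```

Option 1 is usually the simpler choice when the orders are already decided, since it removes the remaining ambiguity instead of widening the search.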

Your custom_3 model produces estimation errors in the stats::arima() function.


> arima(databrics_dummy_ts$observed_cleaned, order = c(0,1,3), seasonal = list(order = c(0,1,1), period = 7))

Call:
arima(x = databrics_dummy_ts$observed_cleaned, order = c(0, 1, 3), seasonal = list(order = c(0, 
    1, 1), period = 7))

Coefficients:
      ma1  ma2  ma3  sma1
        0    0    0     0
s.e.    0    0    0     0

sigma^2 estimated as 2.355e-29:  log likelihood = 14130.01,  aic = -28250.03
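
One way to read this output (an interpretation, not something stated in the thread) is that the dummy series is an exact straight line, so nothing is left to model after differencing. The base R sketch below recreates the same values and applies the regular and seasonal differences by hand; the names `x`, `d1` and `d2` are only illustrative.

```r
# Same values as observed_cleaned in the dummy data
x <- seq(20, 60, length = 456)

# Regular difference (d = 1): a perfectly linear series has a constant slope
d1 <- diff(x)

# Seasonal difference with period 7 (D = 1) of that constant series
d2 <- diff(d1, lag = 7)

range(d1)     # minimum and maximum agree up to floating-point error
max(abs(d2))  # effectively zero
```

With real, noisy data the differenced series is not identically zero, so this degenerate case would not be expected to arise.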

Thanks @mitchelloharawild!
I got it now.
I want to understand a few more things. I am training a model with Fourier terms. Using the fable interface, how can I add the Fourier terms in the forecast function when forecasting the future?

Also, when I save the model in .RDS format, I am not able to retrieve it. Is there another way to save models?

Including Fourier terms as an exogenous regressor can be done by adding fourier(period = "year", K = num_harmonics) to the model formula. When forecasting such a model, the Fourier terms for the future periods are produced automatically.
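
A minimal sketch of this, assuming a made-up daily series with a weekly pattern (so `period = "week"` is used here instead of `"year"`); the names `toy`, `harmonic` and `path` are hypothetical. The last few lines show the ordinary base R `saveRDS()`/`readRDS()` round trip for the fitted mable. The thread does not say why retrieval failed, so treat that part as a baseline to compare against, with fable attached in the session that reads the file.

```r
library(fable)
library(tsibble)
library(dplyr)

# Made-up data for illustration only
set.seed(1)
n <- 140
toy <- tsibble(
  cal_dt = as.Date("2020-01-01") + 0:(n - 1),
  y = 50 + 5 * sin(2 * pi * (0:(n - 1)) / 7) + rnorm(n),
  index = cal_dt
)

# Fourier terms enter the formula like any other regressor;
# PDQ(0, 0, 0) leaves the seasonality to the harmonics.
fit <- toy %>%
  model(harmonic = ARIMA(y ~ fourier(period = "week", K = 2) + PDQ(0, 0, 0)))

# No new_data is needed: the future Fourier terms are generated by forecast()
fc <- fit %>% forecast(h = 14)

# Save the fitted models and read them back
path <- tempfile(fileext = ".rds")
saveRDS(fit, path)
fit_restored <- readRDS(path)
fc_restored <- fit_restored %>% forecast(h = 14)
```

Regressors that are not generated from the time index, unlike `fourier()`, would need their future values supplied through `new_data`.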
