Error in model.frame.default: variable lengths differ.

Im new with R and I need to compare the accuracy of ARIMAX and ARIMA. This is a sample of my data and what I've done to do the ARIMA model:

library(dplyr)
library(forecast)
library(lubridate)
data<-tibble::tribble(
        ~id, ~day, ~month, ~year, ~value, ~reg1, ~reg2,
         1L,   1L,     1L, 2019L,  4.634, 0.626, 0.684,
         1L,   1L,     2L, 2019L,  2.969, 0.698, 0.049,
         1L,   1L,     3L, 2019L,  1.885,  0.62, 0.155,
         1L,   1L,     4L, 2019L,  2.415, 0.553, 0.959,
         1L,   1L,     5L, 2019L,  2.215, 0.598, 0.065,
         1L,   1L,     6L, 2019L,  1.805, 0.454,  0.07,
         1L,   1L,     7L, 2019L,  4.682, 0.045, 0.376,
         1L,   1L,     8L, 2019L,  4.248, 0.087, 0.094,
         1L,   1L,     9L, 2019L,   0.55, 0.523,  0.86,
         1L,   1L,    10L, 2019L,  0.109, 0.176, 0.591,
         2L,   1L,     1L, 2019L,  2.918, 0.442, 0.956,
         2L,   1L,     2L, 2019L,  3.083, 0.233, 0.388,
         2L,   1L,     3L, 2019L,  3.271, 0.652, 0.946,
         2L,   1L,     4L, 2019L,  2.175, 0.704, 0.902,
         2L,   1L,     5L, 2019L,   4.51, 0.851, 0.533,
         2L,   1L,     6L, 2019L,  4.178, 0.655, 0.614,
         2L,   1L,     7L, 2019L,  1.956, 0.434, 0.977,
         2L,   1L,     8L, 2019L,  3.219, 0.418,   0.4,
         2L,   1L,     9L, 2019L,   2.72, 0.335, 0.096,
         2L,   1L,    10L, 2019L,  4.519, 0.534, 0.388,
         3L,   1L,     1L, 2019L,  2.969, 0.707, 0.752,
         3L,   1L,     2L, 2019L,  2.456, 0.085, 0.651,
         3L,   1L,     3L, 2019L,  0.418, 0.851, 0.399,
         3L,   1L,     4L, 2019L,  2.324, 0.626, 0.317,
         3L,   1L,     5L, 2019L,  3.548, 0.175, 0.081,
         3L,   1L,     6L, 2019L,   3.74, 0.667, 0.691,
         3L,   1L,     7L, 2019L,   4.48, 0.853, 0.259,
         3L,   1L,     8L, 2019L,   0.18, 0.016, 0.489,
         3L,   1L,     9L, 2019L,  3.028,  0.51, 0.741,
         3L,   1L,    10L, 2019L,  4.652, 0.916, 0.953
        )
data<-data %>% 
  mutate(date=as.character(make_date(year,month,day)),YearMonth = tsibble::yearmonth((ymd(date)))) %>%
  as_tsibble(key=id,index = YearMonth)


fit <- data %>% 
  filter(YearMonth <= yearmonth("2019 Aug")) %>%
  model(ARIMA(value ~ PDQ(0,0,0), stepwise=FALSE, approximation=FALSE))

# Now forecast the test set and compute RMSE and MSE
fit %>%
  forecast(h = 2) %>%
  accuracy(data)

Now I need to do this but with an ARIMAX:

covariates <- c("reg1","reg2")
fit_arimax <- data %>% 
  filter(YearMonth <= yearmonth("2019 Aug")) %>%
  group_by(id) %>% 
  do(autoarima=auto.arima(.$value,xreg = as.matrix(data[,covariates])))

and I get the following error:

Error in model.frame.default(formula = x ~ xregg, drop.unused.levels = TRUE) : 
  variable lengths differ (found for 'xregg')  

In addition: Warning message: In !is.na(x) & !is.na(rowSums(xreg)) : longer object length is not a multiple of shorter object length

I saw this answer in stackoverflow but it didn't workt, maybe because I'm a beginner in R.

So I want to know if ARIMA from fable package has something to work with regressors or how to solve it with auto.arima. After that, I've to compare the accuracy by ID in ARIMA and ARIMAX. Does anyone know how to? thanks !

For the record, I've answered this question at https://stackoverflow.com/questions/62048082/arimax-with-grouped-data-r

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.