tsCV Function in R

Suppose I have the following data:

library(forecast)
library(lubridate)

set.seed(123)

weeks <- rep(seq(as.Date("2010-01-01"), as.Date("2023-01-01"), by = "week"), each = 1)
counts <- rpois(length(weeks), lambda = 50)
df <- data.frame(Week = as.character(weeks), Count = counts)

# Convert Week column to Date format
df$Week <- as.Date(df$Week)

# Create a time series object
ts_data <- ts(df$Count, frequency = 52, start = c(year(min(df$Week)), 1))

I am trying to learn how to use "Rolling Cross Validation" for Time Series Models (e.g. ARIMA) in R.

As I understand, this involves (ordering the data in chronological order):

  • Fit a model to the first 60 data points, predict the next 5 and record the error
  • Next, fit a model model to the first 65 points, predict the next 5 and record the error
  • etc.

Previously, I had tried to write the R code myself for implementing this procedure (Correctly Using the Window() Functions in R)- however, this appeared to be difficult and I am now interested in seeing if there are any ready-made implementations of such a procedure.

While looking on the internet, I found the following function that I think might be able to accomplish the desired task : R: Time series cross-validation

I tried to run this function on my data:

# note: I am specifically interested in using the auto.arima() function
far2 <- function(x, h){forecast(auto.arima(x), h=h)}
e <- tsCV(ts_data, far2, h=5)

The code seems to be running - But I am not sure if I am doing this correctly.

For example:

  • Is this code fitting a time series model on the first 5 points, predicts next 5, record error - then fits model on first 10 points, predicts next 5, record error, etc.?
  • From here, when the code finishes running - do I need to manually calculate the MAE and RMSE errors myself?

Can someone please comment on this?

Thanks!