My question relates to the forecasting performance of 1- to many-step-ahead forecasts. An example from the FPP3 textbook section on time series cross-validation is shown below (with minimal edits for compactness). The example relates to 1- to 8-step-ahead drift forecasts.
```r
library(fpp3)

google_2015 <- gafa_stock %>%
  filter(Symbol == "GOOG", year(Date) >= 2015) %>%
  mutate(day = row_number()) %>%
  update_tsibble(index = day, regular = TRUE) %>%
  filter(year(Date) == 2015)

google_2015_tr <- google_2015 %>%
  stretch_tsibble(.init = 3, .step = 1)

fc <- google_2015_tr %>%
  model(RW(Close ~ drift())) %>%
  forecast(h = 8) %>%
  group_by(.id) %>%
  mutate(h = row_number()) %>%
  ungroup()

fc %>%
  accuracy(google_2015) %>%
  select(.model, .type, RMSE, MAE, MAPE, MASE)
```
However, `google_2015` contains 252 observations. Shouldn't we be using `.init = 51` (since 0.2 × 252 = 50.4, rounded up) to ensure that at least 20% of the total observations are used as training data in the first fold?