Evaluating point forecast accuracy

Hi,

I am working my way through the Forecasting: Principles and Practice book: https://otexts.com/fpp3/

In section 5.8, the book describes how to evaluate point forecast accuracy. It says this can only be assessed by applying the model to data that wasn't used to fit it.

However, the example creates a training dataset that is a subset of the main dataset, and then calculates the model's accuracy against the main dataset, even though the main dataset contains the data that was used to fit the model. I am confused about why this is done and why separate train and test datasets aren't produced. Would it be better to train the model on a completely separate training dataset and then evaluate it on a test dataset?

The code given in the book is as follows:

library(fpp3)

recent_production <- aus_production %>% filter(year(Quarter) >= 1992)
beer_train <- recent_production %>% filter(year(Quarter) <= 2007)

beer_fit <- beer_train %>%
  model(
    Mean = MEAN(Beer),
    `Naïve` = NAIVE(Beer),
    `Seasonal naïve` = SNAIVE(Beer),
    Drift = RW(Beer ~ drift())
  )

beer_fc <- beer_fit %>%
  forecast(h = 10)

accuracy(beer_fc, recent_production)
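
For illustration (my addition, not from the book), a quick row count confirms the overlap I am describing: recent_production contains every row that was used for fitting, plus the 10 held-out quarters.

nrow(recent_production)  # 74 quarters: 1992 Q1 to 2010 Q2
nrow(beer_train)         # 64 quarters: 1992 Q1 to 2007 Q4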


The accuracy() function only uses the rows of the data set that correspond to the test period, i.e. the times covered by the forecasts. If only the test data is passed, the same error measures are obtained, apart from the two scaled measures noted below.

library(fpp3)

recent_production <- aus_production %>% filter(year(Quarter) >= 1992)
beer_train <- recent_production %>% filter(year(Quarter) <= 2007)
beer_test <- recent_production %>% filter(year(Quarter) > 2007)

beer_fit <- beer_train %>%
  model(
    Mean = MEAN(Beer),
    `Naïve` = NAIVE(Beer),
    `Seasonal naïve` = SNAIVE(Beer),
    Drift = RW(Beer ~ drift())
  )

beer_fc <- beer_fit %>%
  forecast(h = 10)

accuracy(beer_fc, recent_production)
#> # A tibble: 4 x 10
#>   .model         .type    ME  RMSE   MAE    MPE  MAPE  MASE RMSSE    ACF1
#>   <chr>          <chr> <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
#> 1 Drift          Test  -54.0  64.9  58.9 -13.6  14.6  4.12  3.87  -0.0741
#> 2 Mean           Test  -13.8  38.4  34.8  -3.97  8.28 2.44  2.29  -0.0691
#> 3 Naïve          Test  -51.4  62.7  57.4 -13.0  14.2  4.01  3.74  -0.0691
#> 4 Seasonal naïve Test    5.2  14.3  13.4   1.15  3.17 0.937 0.853  0.132
accuracy(beer_fc, beer_test)
#> # A tibble: 4 x 10
#>   .model         .type    ME  RMSE   MAE    MPE  MAPE  MASE RMSSE    ACF1
#>   <chr>          <chr> <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
#> 1 Drift          Test  -54.0  64.9  58.9 -13.6  14.6    NaN   NaN -0.0741
#> 2 Mean           Test  -13.8  38.4  34.8  -3.97  8.28   NaN   NaN -0.0691
#> 3 Naïve          Test  -51.4  62.7  57.4 -13.0  14.2    NaN   NaN -0.0691
#> 4 Seasonal naïve Test    5.2  14.3  13.4   1.15  3.17   NaN   NaN  0.132

Created on 2020-11-24 by the reprex package (v0.3.0)

All columns are identical apart from MASE and RMSSE. In those two cases, the training data is needed to calculate the scaling factor, so when only the test data is included they cannot be computed.
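
To make that concrete, here is a minimal sketch (my own, following the definitions in section 5.8) of how the scaling factors are derived from the training data. The MASE denominator is the in-sample mean absolute error of the seasonal naïve method (lag 4 for quarterly data), and RMSSE uses the squared analogue:

train_y <- beer_train$Beer

# In-sample seasonal naive errors on the training data (lag 4 for quarterly data)
d <- train_y[-(1:4)] - train_y[1:(length(train_y) - 4)]

scale_mase  <- mean(abs(d))     # MASE denominator
scale_rmsse <- sqrt(mean(d^2))  # RMSSE denominator

# These reproduce the scaled columns of accuracy(beer_fc, recent_production)
accuracy(beer_fc, recent_production) %>%
  mutate(
    MASE_manual  = MAE / scale_mase,
    RMSSE_manual = RMSE / scale_rmsse
  ) %>%
  select(.model, MASE, MASE_manual, RMSSE, RMSSE_manual)

With only beer_test there is no training series to compute d from, which is why those two columns come back as NaN.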
