tidyverse:::fable bagged models

Regarding to:

Most of the required code is in the book. The only thing left to do is to turn the bagged forecast into a fable object. If you're only interested in point forecasts, we can just create a degenerate distribution based on the mean. To do full distributional forecasting, we would need to create a mixture distribution. Either way, once we have a fable, the accuracy function can be applied. Here is an example using a degenerate distribution based on the mean.

library(fpp3)
#> ── Attaching packages ──────────────────────────────────────────── fpp3 0.4.0 ──
#> ✓ tibble      3.1.4      ✓ tsibble     1.0.1 
#> ✓ dplyr       1.0.7      ✓ tsibbledata 0.3.0 
#> ✓ tidyr       1.1.3      ✓ feasts      0.2.2 
#> ✓ lubridate   1.7.10     ✓ fable       0.3.1 
#> ✓ ggplot2     3.3.5
#> ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
#> x lubridate::date()    masks base::date()
#> x dplyr::filter()      masks stats::filter()
#> x tsibble::intersect() masks base::intersect()
#> x tsibble::interval()  masks lubridate::interval()
#> x dplyr::lag()         masks stats::lag()
#> x tsibble::setdiff()   masks base::setdiff()
#> x tsibble::union()     masks base::union()
library(distributional)

# Training data
cement <- aus_production %>%
  filter(year(Quarter) >= 1988, year(Quarter) <= 2008) %>%
  select(Quarter, Cement)
# ETS forecasts
ets_forecasts <- cement %>%
  model(ets = ETS(Cement)) %>%
  forecast(h = 6) %>%
  select(-.model)
# Bagged forecasts (averaged over 100 replicates)
bagged_forecasts <- cement %>%
  model(stl = STL(Cement)) %>%
  generate(new_data = cement, times = 100, bootstrap_block_size = 8) %>%
  select(-.model, -Cement) %>%
  model(ets = ETS(.sim)) %>%
  forecast(h = 6) %>%
  summarise(bagged_mean = mean(.mean)) %>%
  mutate(dist = dist_degenerate(bagged_mean)) %>%
  as_fable(response = "Cement", distribution = dist)
#> Warning: The dimnames of the fable's distribution are missing and have been set
#> to match the response variables.
# Compare accuracy
bind_rows(
    ets_forecasts %>% accuracy(aus_production) %>% mutate(method = "ETS"),
    bagged_forecasts %>% accuracy(aus_production) %>% mutate(method = "Bagged ETS")
  ) %>%
  select(method, everything())
#> # A tibble: 2 × 10
#>   method     .type    ME  RMSE   MAE    MPE  MAPE  MASE RMSSE   ACF1
#>   <chr>      <chr> <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>  <dbl>
#> 1 ETS        Test  -157.  178.  161.  -7.55  7.71  1.63  1.38 -0.327
#> 2 Bagged ETS Test  -222.  242.  222. -10.7  10.7   2.25  1.87 -0.403

How would this work with new_data=myNewData, instead of h=6?
Since it would yield: 'Provided data contains a different key structure to the models.'

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.