# Accuracy of prediction intervals in fable

I have started using fable and am wondering whether it has a function to calculate the accuracy of prediction intervals for a given forecasting model, or whether we need to extract the intervals and compute it with a user-defined function. A reproducible example would be very helpful.

There are a few accuracy measures available in fabletools which allow you to evaluate the accuracy of intervals and distributions.

For intervals, the Winkler score (`winkler_score()`) is available.
For distributions, `percentile_score()` and `CRPS()` are available.
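
To make the interval measure concrete, here is a minimal base R sketch of the usual Winkler score definition: the interval width, plus a penalty of 2/alpha times the distance by which the observation falls outside the interval. The function name `winkler_interval()` and the toy numbers are made up for illustration. This is not the fabletools implementation.

```r
# Hypothetical helper: Winkler score for central prediction intervals.
# `level` is the nominal coverage in percent, so alpha = 1 - level / 100.
winkler_interval <- function(lower, upper, actual, level = 95) {
  alpha <- 1 - level / 100
  width <- upper - lower
  # Distance by which each observation misses the interval (0 if covered)
  below <- pmax(lower - actual, 0)
  above <- pmax(actual - upper, 0)
  # Average over all forecast horizons / observations
  mean(width + (2 / alpha) * (below + above))
}

# Two 80% intervals: the first covers its observation, the second misses by 5
score <- winkler_interval(
  lower = c(90, 95), upper = c(110, 115),
  actual = c(100, 120), level = 80
)
score
#> [1] 45
```

Lower is better: narrow intervals are rewarded, and misses are penalised more heavily at higher coverage levels.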

Explanations of how the Winkler score and `percentile_score()` are computed are available here: https://robjhyndman.com/papers/forecasting_state_of_the_art.pdf
There should be plenty of resources online to learn about continuous ranked probability scores (`CRPS()`).
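
As a quick illustration of what CRPS measures, here is a base R sketch using the well-known closed form for a normal forecast distribution. The normality assumption and the function name `crps_normal()` are mine, chosen to keep the example short. `CRPS()` in fabletools works on general forecast distributions and is not necessarily computed this way.

```r
# Hypothetical helper: CRPS of a N(mu, sigma^2) forecast against observations.
# Closed form: sigma * ( z * (2 * Phi(z) - 1) + 2 * phi(z) - 1 / sqrt(pi) ),
# where z = (actual - mu) / sigma.
crps_normal <- function(mu, sigma, actual) {
  z <- (actual - mu) / sigma
  crps <- sigma * (z * (2 * pnorm(z) - 1) + 2 * dnorm(z) - 1 / sqrt(pi))
  # Average over all forecast horizons / observations
  mean(crps)
}

# Same forecast mean, different spread: the sharper forecast scores better
# when the observation lands near the centre of the distribution.
crps_sharp <- crps_normal(mu = 100, sigma = 5, actual = 101)
crps_wide  <- crps_normal(mu = 100, sigma = 20, actual = 101)
crps_sharp < crps_wide
#> [1] TRUE
```

CRPS is on the same scale as the data and, like the Winkler score, smaller values indicate better forecasts.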

Commonly used (and implemented) accuracy measures are organised into lists named `interval_accuracy_measures` and `distribution_accuracy_measures`, and I have used these below. However, it is also possible to create your own list of accuracy measures to use.

```r
library(tsibble)
library(fable)
library(dplyr)

us_deaths <- as_tsibble(USAccDeaths)

us_deaths %>%
  # Withhold a test set of one year
  filter(index < yearmonth("1978 Jan")) %>%
  # Model the training data
  model(ETS(value)) %>%
  # Forecast the test set
  forecast(h = "1 year") %>%
  # Compute interval/distribution accuracy
  accuracy(us_deaths, measures = c(interval_accuracy_measures, distribution_accuracy_measures))
#> # A tibble: 1 x 5
#>   .model     .type winkler percentile  CRPS
#>   <chr>      <chr>   <dbl>      <dbl> <dbl>
#> 1 ETS(value) Test    2036.       91.6  181.
```

Created on 2020-01-28 by the reprex package (v0.3.0)
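
The point above about creating your own list of measures could look something like the sketch below. It assumes a fabletools release that exports `winkler_score()` and passes extra arguments of `accuracy()` (here `level = 80`) on to the measures. The list names `winkler80` and `crps` are arbitrary labels of mine. No output is shown, since the values depend on package versions.

```r
library(tsibble)
library(fable)
library(dplyr)

us_deaths <- as_tsibble(USAccDeaths)

# Forecast the withheld final year, as in the example above
fc <- us_deaths %>%
  filter(index < yearmonth("1978 Jan")) %>%
  model(ETS(value)) %>%
  forecast(h = "1 year")

# A named list of measures; the names become column names in the result
my_measures <- list(winkler80 = winkler_score, crps = CRPS)

# `level = 80` is forwarded to the measures, so the Winkler score is
# evaluated on 80% intervals (assumed default otherwise: 95%)
acc <- accuracy(fc, us_deaths, measures = my_measures, level = 80)
acc
```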


Is this still valid for time series cross-validation? If we fit a model to various rolling windows, e.g. using `stretch_tsibble()`, can we still get the `winkler`, `percentile_score` and `CRPS` measures? If so, how are they summarised across multiple rolling windows?

I don't know of any issues using these measures with cross-validation.
You can summarise them in many ways. Because the measures are averages, taking the mean is a natural choice. The median is also reasonable, and I often look at and compare densities of accuracy measures.
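
A rough sketch of the cross-validation case, under a few assumptions of mine: the window settings (`.init = 36`, `.step = 6`, 6-step forecasts) are arbitrary, the installed fabletools exports `winkler_score()`, and `accuracy()` supports the `by` argument. The last six months are withheld before stretching so that every fold has observed values to score against.

```r
library(tsibble)
library(fable)
library(dplyr)

us_deaths <- as_tsibble(USAccDeaths)

# Expanding training windows, each identified by `.id`
fc_cv <- us_deaths %>%
  filter(index < yearmonth("1978 Jul")) %>%
  stretch_tsibble(.init = 36, .step = 6) %>%
  model(ETS(value)) %>%
  forecast(h = 6)

my_measures <- list(winkler = winkler_score, CRPS = CRPS)

# One row per fold and model
per_fold <- fc_cv %>%
  accuracy(us_deaths, measures = my_measures, by = c(".id", ".model"))

# Summarise across folds with the mean and the median
cv_summary <- per_fold %>%
  group_by(.model) %>%
  summarise(across(c(winkler, CRPS), list(mean = mean, median = median)))
cv_summary
```

Keeping the per-fold table around (`per_fold`) also makes it easy to plot and compare the distribution of each measure across folds, rather than relying on a single summary value.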
