BSTS Package: Error in the Mean Absolute Percentage Estimate (MAPE) as Inf % in a Bayesian Inference with MCMC Plot using ggplot() in R

Overview:

I am conducting a Bayesian time series analysis with mcmc simulations. To visualise the mean absolute percentage error (MAPE) for the analysis, I am producing a plot using ggplot().

I am following this tutorial below:

Problems:

In the past, I have successfully produced this plot (see plot 1 below) with my R-code that can be found in this previously asked Stack Overflow question.. The mean holdout value was 32.52 %, and the actual and fitted data plotted nicely (plot 1).

However, I have not used this code for a couple of months, and today, I needed to extract the plot for work. To my dismay, the R-code that was previously working absolutely fine is no longer working (see plot 2). I have tried to adapt the R-code below in the data frame d2 (see R-code below) as well as checking the R-code in the plotting object itself (produced using ggplot()), and I just cannot find the answer.

For example:

  1. The y-axis is ranging from 0 to 15,000.00 in plot 2, whereas, the correct data range should be between 0-120 as shown in plot 1 (when the code was working).
  2. The MAPE value in plot 2 is Inf %, when the correct MAPE value is 32.52 % (see plot 1 - produced with the R-code in the link above to my previous Stack Overflow question).

However, the only difference between my R-code in this question and my previous question was that I needed to add .001 to the x object (see R-code below) since the bsts() function would not accept zero values for the months of June and July (see the data frame below - BSTS_df).

When I run the MAPE object, I have noticed that the outcome is infinity (see below), which is very strange:

                      **MAPE
                      1  Inf**

I have also noticed that the values in the column UL produced from the posterior.interval object is ranging up to approximately 17,500.00, which I assume is not correct (see below):-

           LL          UL       Date
1  3.05009089  9214.58787 2017-01-01
2  2.74650830  7143.70646 2017-02-01
3  5.68858210  6537.04558 2017-03-01
4  2.76432668  2836.65981 2017-04-01
5  0.78042088  2249.52498 2017-05-01
6  0.04854900    88.57707 2017-06-01
7  0.03650557   145.12395 2017-07-01
8  2.70009631  5338.68317 2017-08-01
9  3.08234007  2329.57492 2017-09-01
10 2.26410227 17146.68785 2017-10-01
11 1.22190125 15561.66013 2017-11-01
12 2.14859047  3326.09852 2017-12-01

If anyone can help me fix these errors, I would be deeply appreciative since my R-code was working extremely well beforehand.

Many thanks in advance.

Plot 1

enter image description here

Plot 2

enter image description here

R-Code

##Open packages for the time series analysis

library(lubridate)
library(bsts)
library(dplyr)
library(ggplot2)
library(ggfortify)

###################################################################################
##Time Series Bayesian Inference Model with mcmc using the bsts() function
##################################################################################

##Change the Year column into YY/MM/DD format for the first of evey month per year
BSTS_df$Year <- lubridate::ymd(paste0(BSTS_df$Year, BSTS_df$Month,"-01"))

##Order the Year column in YY/MM/DD format into the correct sequence: Jan-Dec
allDates <- seq.Date(
                    min(BSTS_df$Year),
                    max(BSTS_df$Year),
                   "month")

##Produce and arrange the new data frame ordered by the first of evey month in YY/MM/DD format
BSTS_new_df <- BSTS_df %>%
                      right_join(data.frame(Year = allDates), by = c("Year")) %>%
                      dplyr::arrange(Year) %>%
                      tidyr::fill(Frequency, .direction = "down")

##Create a time series object
myts2 <- ts(BSTS_new_df$Frequency, start=c(2015, 1), end=c(2017, 12), frequency=12)

##Upload the data into the windows() function
x <- window(myts2, start=c(2015, 01), end=c(2016, 12))
y <- log(x+.001)

##Produce a list for the object ss
ss <- list()
ss <- AddSeasonal(ss, y, nseasons = 12)
ss <- AddLocalLevel(ss, y)

##Produce the bsts model using the bsts() function
bsts.model <- bsts(y, state.specification = ss,  niter = 100, ping = 0, seed = 123)

##Open plotting window
dev.new()

##Plot the bsts.model
plot(bsts.model)

##Get a suggested number of burns
burn<-bsts::SuggestBurn(0.1, bsts.model)

##Predict

p<-predict.bsts(bsts.model, horizon = 12, burn=burn, quantiles=c(.25, .975))

p$mean

##Actual vs predicted

d2 <- data.frame(
      # fitted values and predictions
      c(exp(as.numeric(-colMeans(bsts.model$one.step.prediction.errors[-(1:burn),])+y)),  
      exp(as.numeric(p$mean))),
      # actual data and dates
      as.numeric(BSTS_new_df$Frequency),
      as.Date(BSTS_new_df$Year))

###Rename the columns
names(d2) <- c("Fitted", "Actual", "Date")

### MAPE (mean absolute percentage error)
MAPE <- dplyr::filter(d2, lubridate::year(Date)>=2017) %>%
                                dplyr::summarise(MAPE=mean(abs(Actual-Fitted)/Actual))


### 95% forecast credible interval
posterior.interval <- cbind.data.frame(
                                  exp(as.numeric(p$interval[1,])),
                                  exp(as.numeric(p$interval[2,])),
                                  tail(d2,12)$Date)

##Rename the columns
names(posterior.interval) <- c("LL", "UL", "Date")

### Join intervals to the forecast
d3 <- left_join(d2, posterior.interval, by="Date")

##Open plotting window
dev.new()

### Plot actual versus predicted with credible intervals for the holdout period
ggplot(data=d3, aes(x=Date)) +
       geom_line(aes(y=Actual, colour = "Actual"), size=1.2) +
       geom_line(aes(y=Fitted, colour = "Fitted"), size=1.2, linetype=2) +
       theme_bw() + theme(legend.title = element_blank()) + ylab("") + xlab("") +
       geom_vline(xintercept=as.numeric(as.Date("2017-12-01")), linetype=2) +
       geom_ribbon(aes(ymin=LL, ymax=UL), fill="grey", alpha=0.5) +
       ggtitle(paste0("BSTS -- Holdout MAPE = ", round(100*MAPE,2), "%")) +
       theme(axis.text.x=element_text(angle = -90, hjust = 0))

Data Frame - BSTS_df

structure(list(Year = c(2015, 2015, 2015, 2015, 2015, 2015, 2015, 
2015, 2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016, 2016, 2016, 
2016, 2016, 2016, 2016, 2016, 2016, 2017, 2017, 2017, 2017, 2017, 
2017, 2017, 2017, 2017, 2017, 2017, 2017), Month = structure(c(5L, 
4L, 8L, 1L, 9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 4L, 8L, 1L, 
9L, 7L, 6L, 2L, 12L, 11L, 10L, 3L, 5L, 4L, 8L, 1L, 9L, 7L, 6L, 
2L, 12L, 11L, 10L, 3L), .Label = c("April", "August", "December", 
"February", "January", "July", "June", "March", "May", "November", 
"October", "September"), class = "factor"), Frequency = c(36, 
28, 39, 46, 5, 0, 0, 22, 10, 15, 8, 33, 33, 29, 31, 23, 8, 9, 
7, 40, 41, 41, 30, 30, 44, 37, 41, 42, 20, 0, 7, 27, 35, 27, 
43, 38), Days_Sea_Month = c(31, 28, 31, 30, 6, 0, 0, 29, 15, 
29, 29, 31, 31, 29, 30, 30, 7, 0, 7, 30, 30, 31, 30, 27, 31, 
28, 30, 30, 21, 0, 7, 26, 29, 27, 29, 29)), row.names = c(NA, 
-36L), class = "data.frame")
1 Like

Thanks for the great reprex. The problem is here, doing the calculation of

6   0.2579823      0 2017-06-01        Inf

with abs(Actual-Fitted) / Actual due to division by 0

Hello, Technocrat. Nice to meet you again! Do you know how to adapt this part of the code to solve this issue? I really don't understand what's happened! My code was working absolutely fine beforehand and now it doesn't work. I really don't understand how my code calculated the MAPE as 32.52 % because the code is exactly the same as before. Many thanks if you can help here.

I knew one of the guys who was with the original UI team that founded Netscape, back in the Clinton administration. His tag was It was working 10 minutes ago, I swear! This type of experience is what drives wildly disorganized types like me to github, where I can prove to myself that it's exactly the same. So far, of course, that hasn't happened.

You have two choices for June 1, 2017—censor it or use an imputed value, perhaps the June 2015/2016 mean.

I just ran the code to that line; I'll go back and track where it actually gets used besides ggtitle to see if there's anyother effect and then re-read your question tomorrow about the graphs.

I can relate to those guys at Netscape, and yourself on Github. I have just been playing around with the posterior interval values, and I divided the column UL by 100. This adaptation to my code produced the desired plot with the actual and fitted values (I am not sure if this is the right method). However, it seems odd that I needed to make this effort when the original code produced the right posterior values, not these larger values ranging into the thousands. There must be a way to solve this issue is less processing steps.

I look forward to hearing from you tomorrow regarding how to solve the MAPE and posterior interval issues.

Very kind regards

My attempt at trying to solve the posterior interval issue

             #95% forecast credible interval
               posterior.interval <- cbind.data.frame(
                              exp(as.numeric(p$interval[1,])),
                              exp(as.numeric(p$interval[2,])),
                              tail(d2,12)$Date)

                ##Rename the columns
                names(posterior.interval) <- c("LL", "UL", "Date")

             ##Divide the posterior interval column named "UL" by 100
                New.posterior.interval<-as.data.frame(posterior.interval)

            ##Divide UL by 100
                 Divide.100.UL <- New.posterior.interval %>% 
                                    dplyr::mutate(New_UL = UL/100)

              ##Select the column New_UL instead of UL
                  Posterior.interval<-dplyr::select(Divide.100.UL, LL, New_UL, Date)

            ##Rename the columns to thier original column names
                 names(Posterior.interval) <- c("LL", "UL", "Date")

              ### Join intervals to the forecast
               d3 <- left_join(d2, Posterior.interval, by="Date")

               ##Open plotting window
               dev.new()

            ### Plot actual versus predicted with credible intervals for the holdout period
  
     ggplot(data=d3, aes(x=Date)) +
    geom_line(aes(y=Actual, colour = "Actual"), size=1.2) +
    geom_line(aes(y=Fitted, colour = "Fitted"), size=1.2, linetype=2) +
    theme_bw() + theme(legend.title = element_blank()) + ylab("") + xlab("") +
    geom_vline(xintercept=as.numeric(as.Date("2017-12-01")), linetype=2) +
    geom_ribbon(aes(ymin=LL, ymax=UL), fill="grey", alpha=0.5) +
    ggtitle(paste0("BSTS -- Holdout MAPE = ", round(100*MAPE,2), "%")) +
    theme(axis.text.x=element_text(angle = -90, hjust = 0))

New plot

Hello Technocrat,

I have been playing with my code and I had to add some intermediary steps, and I think I have estimated the MAPE value. However, I'm not sure if my code is correct, and I feel positive that the same plot and MAPE value can be calculated in fewer steps. I am still really curious why the posterior object is producing values in their thousands, and, why my previous code that worked beforehand stopped working. Strange!

        ##Change the Year column into YY/MM/DD format for the first of every month per year
        BSTS_df$Year <- lubridate::ymd(paste0(BSTS_df$Year, BSTS_df$Month,"-01"))

        ##Order the Year column in YY/MM/DD format into the correct sequence: Jan-Dec
         allDates <- seq.Date(
                              min(BSTS_df$Year),
                              max(BSTS_df$Year),
                              "month")

        ##Produce and arrange the new data frame ordered by the first of every month in YY/MM/DD format
                 BSTS_new_df <- BSTS_df %>%
                                      right_join(data.frame(Year = allDates), by = c("Year")) %>%
                                      dplyr::arrange(Year) %>%
                                      tidyr::fill(Frequency, .direction = "down")

        ##Create a time series object
        myts2 <- ts(BSTS_new_df$Frequency, start=c(2015, 1), end=c(2017, 12), frequency=12)

        ##Upload the data into the windows() function
        x <- window(myts2, start=c(2015, 01), end=c(2016, 12))
        y <- log(x+.001)

       ##Produce a list for the object ss
        ss <- list()
        ss <- AddSeasonal(ss, y, nseasons = 12)
        ss <- AddLocalLevel(ss, y)
   
        ##Produce the bsts model
        bsts.model <- bsts(y, state.specification = ss,  niter = 100, ping = 0, seed = 123)

        ##Open plotting window
        dev.new()

        ##Plot the bsts.model
        plot(bsts.model)

        ##Get a suggested number of burns
        burn<-bsts::SuggestBurn(0.1, bsts.model)

        ##Predict
        p<-predict.bsts(bsts.model, horizon = 12, burn=burn, quantiles=c(.25, .975))

        p$mean

         ##Actual vs predicted

         d2 <- data.frame(
            # fitted values and predictions
            c(exp(as.numeric(-colMeans(bsts.model$one.step.prediction.errors[-(1:burn),])+y)),  
            exp(as.numeric(p$mean))),
            # actual data and dates
            as.numeric(BSTS_new_df$Frequency),
            as.Date(BSTS_new_df$Year))

   ###Rename the columns
   names(d2) <- c("Fitted", "Actual", "Date")

    d2

   ##Sum the fitted column in d2
   Fitted_Sum_1<-sum(d2$Fitted)

   ##Sum the actual column in d2
   Actual_Sum_1<-sum(d2$Actual)

   ### MAPE(mean absolute percentage error)

   MAPE_1<-mean(abs(Actual_Sum_1-Fitted_Sum_1)/Actual_Sum_1)

   MAPE_1

  ### 95% forecast credible interval
  posterior.interval <- cbind.data.frame(
                              exp(as.numeric(p$interval[1,])),
                              exp(as.numeric(p$interval[2,])),
                              tail(d2,12)$Date)

   ##Rename the columns
  names(posterior.interval) <- c("LL", "UL", "Date")

   posterior.interval

    ##Divide the posterior interval column named "UL" by 100
   New.posterior.interval<-as.data.frame(posterior.interval)

   #Divide the column UL in the New.posterior.interval data frame by 1000
    Divide.1000.UL <- New.posterior.interval %>% 
                                             dplyr::mutate(New_UL = UL/1000)

 ##Select the column New_UL instead of UL
 Posterior.interval<-dplyr::select(Divide.1000.UL, LL, New_UL, Date)

 ##Rename the columns to thier original column names
  names(Posterior.interval) <- c("LL", "UL", "Date")

  Posterior.interval

 ### Join intervals to the forecast
 d3 <- left_join(d2, Posterior.interval, by="Date")

 ##Open plotting window
 dev.new()

 ### Plot actual versus predicted with credible intervals for the holdout period
   ggplot(data=d3, aes(x=Date)) +
   geom_line(aes(y=Actual, colour = "Actual"), size=1.2) +
   geom_line(aes(y=Fitted, colour = "Fitted"), size=1.2, linetype=2) +
   theme_bw() + theme(legend.title = element_blank()) + ylab("") + xlab("") +
   geom_vline(xintercept=as.numeric(as.Date("2017-12-01")), linetype=2) +
   geom_ribbon(aes(ymin=LL, ymax=UL), fill="grey", alpha=0.5) +
   ggtitle(paste0("BSTS -- Holdout MAPE = ", round(100*MAPE,2), "%")) +
   theme(axis.text.x=element_text(angle = -90, hjust = 0))

Plot with MAPE value

Some progress:

mape is calculated as the average of (actual-predicted) /abs(actual). This means that the function will return -Inf, Inf, or NaN if actual is zero. Due to the instability at or near zero, smape or mase are often used as alternatives.

{Metrics::mape}

Hi Technocrat,

I wrote to the designer of the bsts package and he diagnosed the problem as:

I think at least part of the problem is that you're trying to model the data as a time series of binomials (with "family = 'logit'"). The plots you're looking at are most likely trying to show you the logit of the binomial success probability at time t, which I don't think is what you meant to do. If it is, then I think you want to add another column to the analysis giving the number of trials in each time period. See the help for the 'bsts' function:

If the model family is logit, then there are two ways one can format the response variable. If the response is 0/1, TRUE/FALSE, or 1/-1, then the response variable can be passed as with any other model family. If the response is a set of counts out of a specified number of trials then it can be passed as a two-column matrix, where the first column contains the counts of successes and the second contains the count of failures.

Would you know how to solve this issue?

I would like to express my gratitude for your time, patience, and guidance.

Happy Christmas in advance

or is it count?

Merry Christmas to you, too!

Don't understand the purpose of the exp here.

Dear Technocrat,

My model is for counts and the data is Poisson distributed, not logit or binomial.

No! I played with that part of the code too and I don’t understand the point of the exp() function either. A friend wrote that part.

I also transformed my data using the function scale() so the mean = 0 and std = 1, which dramatically improved the plot. However, this recent correspondence makes me think the arguments in the bsts () model is incorrect for my question of inquiry. Logit vs Counts.

After I transformed the data, the MAPE function worked and provided a value of 29.52%: however, when I ran the code for the plot, the MAPE value became 295.20%. I think this is probably the round() function.

How would you write this code differently for count data?

Some progress!

Thank you for your time and consideration.

Hi, I've been working the code with my own data, which I've got a better sense of.

OK, observations are counts. That makes sense. I'd think that normalizing the data would make the model residuals white noise, but I'll check and look at the count data.

Hi Technocrat,

After the developer advised me to use a Poisson distribution, I had an attempt at writing some new code following this tutorial:

I hope this new code takes out the white noise, and I also ran this new code on untransformed data:-

The new problem is this error message when creating the d2 data frame. What's your opinion?

 ##Create a time series object
 myts2 <- ts(BSTS_new_df$Frequency, start=c(2015, 1), end=c(2017, 12), frequency=12)

 ##Upload the data into the windows() function
x <- window(myts2, start=c(2015, 01), end=c(2016, 12))
y <- x

##Produce a list for the object ss
ss <- list()

#ss <- AddLocalLinearTrend(list(), y)
ss <- AddSeasonal(ss, log1p(y), nseasons = 12)
ss <- AddLocalLevel(ss, y)

##Produce the model

bsts.model<-bsts(y, ss, niter=100, family="poisson", na.action=na.omit, ping=0, seed = 123)

##Open plotting window
 dev.new()

##Plot the bsts.model
plot(bsts.model)

##Get a suggested number of burns
burn<-bsts::SuggestBurn(0.1, bsts.model)

##Predict
p<-predict.bsts(bsts.model, horizon = 12, burn=burn, quantiles=c(.25, .975))

p$mean

  ### Actual versus predicted

  d2 <- data.frame(
  # fitted values and predictions
  c(exp(as.numeric(-colMeans(bsts.model$one.step.prediction.errors[-(1:burn),])+y)),  
  exp(as.numeric(p$mean))),
  # actual data and dates
  as.numeric(BSTS_new_df$Frequency),
  as.Date(BSTS_new_df$Year))

Error Message

    Error in attr(x, "tsp") <- c(1, NROW(x), 1) : 
    attempt to set an attribute on NULL

I don't have Frequency_blue, so I ran it on Frequency and didn't get an error. Hope that helps.

Whoops, typos for Frequency_blue. Sorry, I removed them!!! I quit R-Studio and ran my code again. This time there was no error; however, the posterior values are even worse now. When I use the function log1p() for the ss object, these values amplify by e+26. Under the circumstances, I am not sure if I need to log transform my data at all. I feel really confused about how to proceed! I'm absolutely determined to solve this problem.

Do you have any opinions with how to follow Steve's advice?

-If the response is a set of counts out of a specified number of trials then it can be passed as a two-column matrix, where the first column contains the counts of successes and the second contains the count of failures.

I would like to thank you for your patience and guidance.

I tried to run the new code with the transformed data using the function scale(), and I got an error message:

        ##Scale the data with the mean = 0, and the sd = 1
        BSTS_new_df$Frequency<-scale(BSTS_new_df$Frequency)

      ##Create a time series object
      myts2 <- ts(BSTS_new_df$Frequency_Blue, start=c(2015, 1), end=c(2017, 12), frequency=12)

       ##Upload the data into the windows() function
       x <- window(myts2, start=c(2015, 01), end=c(2016, 12))
       y <- x

     ##Produce a list for the object ss
      ss <- list()

     #ss <- AddLocalLinearTrend(list(), y)
      ss <- AddSeasonal(ss, y, nseasons = 12)
      ss <- AddLocalLevel(ss, y)

    ##Produce the model
   bsts.model<-bsts(y, state.specification = ss, niter=100, family="poisson", na.action=na.omit, ping=0, seed = 123)

Error Message:

         Error in .FormatBstsDataAndOptions(family, response, predictors, model.options,  : 
                                ncol(response) == 2 is not TRUE

Well it's clear that

FormatBstsDataAndOptions

Here's its source:


  ## Grab the timestamps for the response before passing it to
  ## .FormatBstsDataAndOptions so we can use them later.
  if (missing(data)) {
    ## This should be handled in the argument list by setting a default argument
    ## data = NULL, but doing that messes up the "regression black magic"
    ## section above.
    data <- NULL
  }
  timestamp.info <- TimestampInfo(response, data, timestamps)
  formatted.data.and.options <- .FormatBstsDataAndOptions(
      family, response, predictors, model.options, timestamp.info)
  data.list <- formatted.data.and.options$data.list
  model.options <- formatted.data.and.options$model.options

I guess I should ask what the counts are of, generally, and whether these are tests of present/not-present or actually continuous.

I've rerun this substituting 25 for the 0s (between the mean and median) and it appears to act normally when it didn't when we were using log(x + 0.01). So, I think there's a threshold for the minimum value of observations where everything goes, to use the technical term blooey.

From the bsts help

If no regressors are desired then the formula can be replaced by a numeric vector giving the time series to be modeled. Missing values are not allowed in predictors, but they are allowed in the response variable.

I wonder if "no missing values" should be "no missing or near-zero".

Hi Technocrat,

-These are counts of birds
-They were present

I hope you had a nice Christmas. I received further correspondence with the developer of the bsts package again and he made these suggestions:

-I notice that you're using log1p in the Seasonal component but not in the trend component. With the "Poisson" family, the modelling will be done on the log scale (log "lambda", not log Y, so 0's are allowed in the model).

-The default priors are chosen when you AddLocalLevel or AddSeasonal typically assume that you're working with Gaussian data. You may need to directly supply priors (e.g. using SdPrior) when you call AddSomeComponent. See the help pages for AddSeasonal for how to specify priors.

Therefore, I had a go at implementing his suggestions (please see below).

QUESTION: How do you know which prior values to select?

What's your opinion on my new attempt at getting this R-code right?

R-code:

     ##Open packages for the time series analysis

    library(lubridate)
    library(bsts)
    library(dplyr)
    library(ggplot2)
    library(ggfortify)
    library(forecast)

  ##Change the Year column into YY/MM/DD format for the first of evey month per year
  BSTS_df$Year <- lubridate::ymd(paste0(BSTS_df$Year, BSTS_df$Month,"-01"))

 ##Order the Year column in YY/MM/DD format into the correct sequence: Jan-Dec
  allDates <- seq.Date(
                min(BSTS_df$Year),
                max(BSTS_df$Year),
               "month")

 ##Produce and arrange the new data frame ordered by the first of evey month in YY/MM/DD format
 BSTS_new_df <- BSTS_df %>%
                  right_join(data.frame(Year = allDates), by = c("Year")) %>%
                  dplyr::arrange(Year) %>%
                  tidyr::fill(Frequency, .direction = "down")

##Create a time series object
myts2 <- ts(BSTS_new_df$Frequency, start=c(2015, 1), end=c(2017, 12), frequency=12)

##Upload the data into the windows() function
x <- window(myts2, start=c(2015, 01), end=c(2016, 12))
y <- x

   ##Set priors

  sdy <- sd(BSTS_new_df$Frequency)

  sigma.prior <- SdPrior(sigma.guess=0.01*sdy,
                        upper.limit=sdy,
                        sample.size=36)

 ##Produce a list for the object ss
 ss <- list()

#ss <- AddLocalLinearTrend(list(), y)
 ss <- AddSeasonal(ss, y, nseasons = 12)
 ss <- AddLocalLevel(ss, y, sigma.prior)  
                
#bsts.model <- bsts(y, state.specification = ss, family = "poisson", niter = 2, ping=0, seed=1234)

bsts.model<-bsts(y, state.specification = ss, niter=100, family="poisson", ping=0, seed = 123)

##Open plotting window
dev.new()

##Plot the bsts.model
plot(bsts.model)

##Make the predictions for the bsts model
predict<-predict(bsts.model, horizon=1)

predict

 ##Open plotting window
 dev.new()

 ##Plot the predictions
 plot(predict)

 ##Get a suggested number of burns
 burn<-bsts::SuggestBurn(0.1, bsts.model)

 ##Predict
 p<-predict.bsts(bsts.model, horizon = 12, burn=burn, quantiles=c(.25, .975))

 p$mean

##Actual vs predicted

  d2 <- data.frame(
           # fitted values and predictions
           c(as.numeric(-colMeans(bsts.model$one.step.prediction.errors[-(1:burn),])+y),  
           as.numeric(p$mean)),
           # actual data and dates
           as.numeric(BSTS_new_df$Frequency),
           as.Date(BSTS_new_df$Year))

  ###Rename the columns
  names(d2) <- c("Fitted", "Actual", "Date")

  d2

##Sum the fitted column in d2
Fitted_Sum<-sum(d2$Fitted)

##Sum the actual column in d2
Actual_Sum<-sum(d2$Actual)

###MAPE(mean absolute percentage error)

MAPE<-mean(abs(Actual_Sum-Fitted_Sum)/abs(Actual_Sum))

MAPE

###95% forecast credible interval
 posterior.interval <- cbind.data.frame(
                                  as.numeric(p$interval[1,]),
                                  as.numeric(p$interval[2,]),
                                  tail(d2,12)$Date)

##Rename the columns
names(posterior.interval) <- c("LL", "UL", "Date")

posterior.interval

###Join intervals to the forecast
d3 <- left_join(d2, posterior.interval, by="Date")

##Open plotting window
dev.new()

###Plot actual versus predicted with credible intervals for the holdout period

 ggplot(data=d3, aes(x=Date)) +
       geom_line(aes(y=Actual, colour = "Actual"), size=1.2) +
       geom_line(aes(y=Fitted, colour = "Fitted"), size=1.2, linetype=2) +
       theme_bw() + theme(legend.title = element_blank()) + ylab("") + xlab("") +
       geom_vline(xintercept=as.numeric(as.Date("2017-12-01")), linetype=2) +
       geom_ribbon(aes(ymin=LL, ymax=UL), fill="grey", alpha=0.5) +
       ggtitle(paste0("BSTS -- Holdout MAPE = ", round(100*MAPE,2), "%")) +
       theme(axis.text.x=element_text(angle = -90, hjust = 0))

Taking an intuitionist stance,

 BSTS_new_df %>% group_by(the_month = month(Year)) %>% summarize(Avg = mean(Frequency))
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 12 x 2
   the_month   Avg
       <dbl> <dbl>
 1         1 37.7 
 2         2 31.3 
 3         3 37   
 4         4 37   
 5         5 11   
 6         6  3   
 7         7  4.67
 8         8 29.7 
 9         9 28.7 
10        10 27.7 
11        11 27   
12        12 33.7 

looks like it reflects a process that falls into three groups, with maybe February a little short in the count due to the number of days. So, I'd agree that a sd based prior of 14.63 is reasonable. So why reduce it to 0.146? Well, it seems that if it's not reduced the fitted values grow very high. I also tried fiddling with the upper.limit parameter and got some low MAPEs. I really don't feel like I have a good handle on the gamma concept. I'll look at other packages to see if I can pick up some insight.

Hi Technocrat.

I wrote to the developer of the bsts package and he suggested:

With a Poisson model the changes in the state equation describe "log lambda", so 'sdy' is probably a much larger number than you want for the prior standard deviation in the local level model. Also, the "sample.size" argument to SdPrior is the number of observations worth of weight you want to assign to your prior guess. If you set it equal to the same size in your data set then the data and your prior will each have equal weight. Most of the time you'd like the data to weigh much more. I think with this prior you've substantially overstated the instability in the time series you're modelling.

Could I please ask what syntax you used in your R-code to play with different upper limits to produce small MAPE values?

Question:

Is there a pre-processing method to set-up the Bayesian priors for the best model (i.e. after running many models) when you do not have clear priors?

Many thanks for your time and patience.