Holt-Winters forecast: need code corrected

I have the following code which contains errors

Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 341 obs. of 2 variables:
 $ month: Date, format: "1982-06-12" "1982-06-19" ...
 $ sales: num 14237 4520 55691 28295 23648 ...
 - attr(*, "spec")=
  .. cols(
  .. month = col_date(format = ""),
  .. sales = col_double()
  .. )

The code below gives me the Holt-Winters forecast:

# load needed packages

library(readr)
library(dplyr)
library(ggplot2)
library(forecast)

# import data

sales <- read_csv("../input/MonthlySales.csv")

# explore data

str(sales)
head(sales, n = 5)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 341 obs. of 2 variables:
 $ month: Date, format: "1982-06-12" "1982-06-19" ...
 $ sales: num 14237 4520 55691 28295 23648 ...
 - attr(*, "spec")=
  .. cols(
  .. month = col_date(format = ""),
  .. sales = col_double()
  .. )

options(repr.plot.width = 6, repr.plot.height = 3)
ggplot(sales, aes(x = month, y = sales)) +
  geom_line() +
  geom_smooth(method = 'lm') +
  labs(x = "Time", y = "Monthly Sales")
salesTS <- ts(sales$sales, frequency = 4, start = c(1982,1))
class(salesTS)
'ts'
options(repr.plot.width = 6, repr.plot.height = 5)
salesDecomp <- decompose(salesTS)
plot(salesDecomp)

# log-transform the time series data

salesLog <- log(salesTS)

salesLogHW <- HoltWinters(salesLog)
salesLogHW
Call:
HoltWinters(x = salesLog)

options(repr.plot.width = 6, repr.plot.height = 4)
plot(salesLogHW)

# forecast next year's sales

nextYearSales <- forecast(salesLogHW, h=4)

# plot

plot(nextYearSales)
nextYearSales

My MonthlySales table has the following data (I will show only the first 40 lines):

Month sales
1982-06-12 1
1983-06-19 6
1983-06-20 2
1983-06-21 4
1983-06-22 3
1983-06-23 5
1983-06-24 8
1983-06-25 9
1983-06-26 4
1983-06-27 5
1983-06-28 6
1983-06-29 1
1982-09-04 5
1982-09-05 3
1982-09-06 4
1982-09-07 6
1982-09-08 5
1982-09-09 6
1982-09-10 4
1982-09-11 1
1982-09-12 2
1982-09-13 2
1982-09-14 2
1983-11-20 1
1983-11-21 3
1983-11-22 4
1983-11-23 4
1983-11-24 7
1983-11-25 10
1983-11-26 10
1983-11-27 9
1983-11-28 7
1983-11-29 8
1983-11-30 2
1983-12-01 3
1983-12-02 4
1983-12-03 1
1983-12-04 1
1983-12-05 1

Is there anyone in the RStudio community who can correct the incorrect section?

First, let's look at your data. The Month column actually contains dates that are sometimes one day apart and sometimes a few months apart. It's not clear what that column should contain, but I don't think what you show is correct.

When you create the salesTS object you give it frequency=4 which would be appropriate for quarterly data, but the frequency should be 12 for monthly data.

The code to produce the forecasts should work ok.

Since you take logs before fitting the model, presumably you want forecasts back on the original scale, not on the log scale. You can easily do that by replacing the HoltWinters command with the equivalent model computed using the ets command. Here is an example. Note that lambda=0 is the same as taking logs.

library(ggplot2)
library(forecast)

sales <- data.frame(
  Month = c(
    "1982-06-12", "1983-06-19", "1983-06-20", "1983-06-21", "1983-06-22", 
    "1983-06-23", "1983-06-24", "1983-06-25", "1983-06-26", "1983-06-27", 
    "1983-06-28", "1983-06-29", "1982-09-04", "1982-09-05", "1982-09-06", 
    "1982-09-07", "1982-09-08", "1982-09-09", "1982-09-10", "1982-09-11", 
    "1982-09-12", "1982-09-13", "1982-09-14", "1983-11-20", "1983-11-21", 
    "1983-11-22", "1983-11-23", "1983-11-24", "1983-11-25", "1983-11-26", 
    "1983-11-27", "1983-11-28", "1983-11-29", "1983-11-30", "1983-12-01", 
    "1983-12-02", "1983-12-03", "1983-12-04", "1983-12-05"
  ),
  sales = c(1L, 6L, 2L, 4L, 3L, 5L, 8L, 9L, 4L, 5L, 6L, 1L, 5L, 3L, 4L, 6L, 
            5L, 6L, 4L, 1L, 2L, 2L, 2L, 1L, 3L, 4L, 4L, 7L, 10L, 10L, 9L, 7L, 
            8L, 2L, 3L, 4L, 1L, 1L, 1L)
)

salesTS <- ts(sales$sales, frequency = 12, start = c(1982,1))
salesLogHW <- ets(salesTS, model="AAA", lambda=0)
nextYearSales <- forecast(salesLogHW, h=4)
autoplot(nextYearSales)

nextYearSales
#>          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
#> Apr 1985       5.103658 1.866121 13.95800 1.095562 23.77531
#> May 1985       5.554407 2.030935 15.19076 1.192320 25.87513
#> Jun 1985       4.704604 1.720209 12.86663 1.009900 21.91633
#> Jul 1985       5.758174 2.105440 15.74804 1.236061 26.82437

Created on 2020-04-20 by the reprex package (v0.3.0)

Hi Rob. Thank you very much for your support. The sales figures are altered, as I cannot publish our real sales.
A frequency of 3 should be fine, as that takes the average over every 3 values.

I am trying to add some extra values to the next year's sales table, such as:
the alpha, beta and gamma (ABG) values per line (quarterly values in this case) and the MSE value per line. The ABG values should be rolling over every 3 values.
Can you teach me how I can do that?

Cheers

The frequency is usually the number of observations per year. So I don't think it should be 3. If it is monthly data it should be 12. If it is quarterly data it should be 4. If it is daily data it should be 365. But you haven't explained what the Month column means, so it is impossible to know.
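
To make the effect of the frequency argument concrete, here is a minimal sketch in base R. The 36 values are made up for illustration (they are not the poster's data); the point is only that the same numbers are labelled very differently depending on `frequency`.

```r
# Hypothetical series: 36 made-up values, not the poster's data.
set.seed(1)
x <- round(runif(36, min = 1, max = 10))

# frequency = 12 tells R there are 12 observations per year (monthly data).
monthly <- ts(x, frequency = 12, start = c(1982, 1))

# frequency = 4 labels the very same values as quarters instead.
quarterly <- ts(x, frequency = 4, start = c(1982, 1))

frequency(monthly)  # 12
end(monthly)        # 1984 12: 36 months cover exactly three years
end(quarterly)      # 1990 4: the same 36 values now span nine years
```

Since the seasonal component repeats every `frequency` observations, a wrong value makes the model look for a seasonal pattern of the wrong length.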

You should be able to replace the sales column with random numbers to preserve confidentiality and then post the resulting data set. Like this.

sales$sales <- rnorm(NROW(sales))
dput(sales)

Then paste the resulting output here so we can actually see the data structure properly.

I don't understand what you require. It sounds like you want information about the fitted model as well as forecasts formatted in a table, but what does "rolling per every 3 values" mean?

Hi Rob.
It is monthly data, so 12.

ROLLING FORECAST

A rolling forecast over every 3 values means that it first forecasts over the values
2, 3, 5, 6, 7, 9, 10, 17, 12, 13, 16,
then again over 3, 5, 6, …, then over 5, 6, 7, …, then 6, 7, 9, …, and so on.
So each time it moves forward by 1 value, while taking the next 3 values into account.

NEW LINES OF SALES

I have set up a new series of sales.

The table output I am looking for is something like this
(you might have to copy it into Excel to see what it looks like):

Date sales α δ γ Forecast α Forecast δ Forecast γ
12/06/1982 2 n/a
13/06/1982 3
14/06/1982 5
15/06/1982 6 0.145309691 0.125029156 0.15381477
16/06/1982 7 0.116138478 0.113240008 0.130951664
17/06/1982 9 0.120159054 0.107930279 0.132951374
18/06/1982 10 0.142208382 0.094449719 0.235930509 0.069173091 0.10798745 0.209918475
19/06/1982 17 0.069173091 0.10798745 0.209918475 0.08073675 0.093865212 0.237869234
20/06/1982 12 0.08073675 0.093865212 0.237869234 0.068116421 0.088577842 0.25958786
21/06/1982 13 0.068116421 0.088577842 0.25958786 0.055496091 0.083290471 0.281306485
22/06/1982 16 0.055496091 0.083290471 0.281306485 0.042875762 0.078003101 0.303025111
23/06/1982 11 0.042875762 0.078003101 0.303025111 0.030255432 0.072715731 0.324743736
24/06/1982 7 0.030255432 0.072715731 0.324743736 0.017635102 0.067428361 0.346462362
25/06/1982 25 0.017635102 0.067428361 0.346462362 0.005014773 0.062140991 0.368180987
26/06/1982 8 0.005014773 0.062140991 0.368180987 -0.007605557 0.056853621 0.389899612
27/06/1982 9 -0.007605557 0.056853621 0.389899612 -0.020225886 0.051566251 0.411618238
28/06/1982 4 -0.020225886 0.051566251 0.411618238 -0.032846216 0.046278881 0.433336863
29/06/1982 13 -0.032846216 0.046278881 0.433336863 -0.045466546 0.04099151 0.455055489
30/06/1982 11 -0.045466546 0.04099151 0.455055489 -0.058086875 0.03570414 0.476774114
01/07/1982 7 -0.058086875 0.03570414 0.476774114 -0.070707205 0.03041677 0.498492739
02/07/1982 11 -0.070707205 0.03041677 0.498492739 -0.083327534 0.0251294 0.520211365
03/07/1982 8 -0.083327534 0.0251294 0.520211365 -0.095947864 0.01984203 0.54192999
04/07/1982 4 -0.095947864 0.01984203 0.54192999 -0.108568193 0.01455466 0.563648616
05/07/1982 3 -0.108568193 0.01455466 0.563648616 -0.121188523 0.00926729 0.585367241
06/07/1982 5 -0.121188523 0.00926729 0.585367241 -0.133808853 0.003979919 0.607085867
07/07/1982 6 -0.133808853 0.003979919 0.607085867 -0.146429182 -0.001307451 0.628804492
08/07/1982 4 -0.146429182 -0.001307451 0.628804492 -0.159049512 -0.006594821 0.650523117
09/07/1982 7 -0.159049512 -0.006594821 0.650523117 -0.171669841 -0.011882191 0.672241743
10/07/1982 11 -0.171669841 -0.011882191 0.672241743 -0.184290171 -0.017169561 0.693960368

As I keep saying, you claim it is monthly but you keep showing data which has a Date column showing days not months.

Second, the parameters alpha, beta and gamma relate to the fitted model. You don't forecast them directly, you forecast sales instead using the fitted model.
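
As a rough illustration of that point, the sketch below fits `HoltWinters` once and reads the estimated parameters off the fitted object. It uses the built-in `AirPassengers` series as a stand-in for the sales data (an assumption), and computes the in-sample MSE as SSE divided by the number of fitted values.

```r
# Built-in monthly series standing in for the sales data (an assumption).
fit <- HoltWinters(log(AirPassengers))

# The smoothing parameters are estimated once, for the whole series.
params <- c(alpha = unname(fit$alpha),
            beta  = unname(fit$beta),
            gamma = unname(fit$gamma))

# In-sample one-step-ahead MSE: SSE divided by the number of fitted values.
mse <- fit$SSE / nrow(fit$fitted)

print(round(params, 3))
print(mse)

# Forecasts are made for the series itself (here on the log scale),
# using those fixed parameters; exp() would return them to the original scale.
fc <- predict(fit, n.ahead = 12)
print(fc)
```

There is a single alpha, beta and gamma per fitted model, so a table with one set of parameters per row only makes sense if a separate model is fitted for each row.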

Finally, you can't fit a model with 3 parameters to only 3 observations.
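
If per-row parameters are wanted anyway, one possible reading of "rolling" is to refit the model on a moving window that is long enough to estimate it. The sketch below is only that, a sketch: `AirPassengers` again stands in for the real data, the window `width` of 48 months is a hypothetical choice (a seasonal `HoltWinters` fit needs at least two full periods to initialise), and `fit_window` is a helper invented here.

```r
y <- log(AirPassengers)  # stand-in monthly series (assumption)
width <- 48              # hypothetical window: four full years of data
starts <- seq_len(length(y) - width + 1)

# Hypothetical helper: refit Holt-Winters on the window starting at row i.
fit_window <- function(i) {
  # Rebuild a ts for the window so HoltWinters knows the seasonal period.
  w <- ts(y[i:(i + width - 1)], frequency = frequency(y))
  tryCatch({
    fit <- HoltWinters(w)
    c(alpha = unname(fit$alpha),
      beta  = unname(fit$beta),
      gamma = unname(fit$gamma),
      mse   = fit$SSE / nrow(fit$fitted))
  }, error = function(e) {
    # If the optimiser fails on a window, record missing values instead.
    c(alpha = NA_real_, beta = NA_real_, gamma = NA_real_, mse = NA_real_)
  })
}

# One row per window, indexed by the time of the window's last observation.
rolling <- data.frame(window_end = as.numeric(time(y))[starts + width - 1],
                      t(sapply(starts, fit_window)))
head(rolling)
```

Each row is a separate model fitted to 48 observations, so the parameters can jump around from window to window; this is not necessarily what the Excel sheet computes.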

Hi, I checked. Our sales guy says it is a monthly production forecast, on a daily basis, from one of his customers. So I guess it is daily after all.
If I wish to forecast ABG directly, would that be possible in R?
My Excel table has more than 2000 observations.

I give up. You aren't defining the problem clearly enough for me to be able to help.

Hi Rob.
I am truly sorry. I am not good at explaining.

The problem is that we currently use Excel to estimate the baseline, trend and season for the sales/production. Our internal production is estimated for the following day (which is the total sales for a month for a certain customer). This is why I originally thought it was a monthly number.

That Excel table looks like this:

Day Sales α accurate δ accurate γ accurate expected α expected δ expected γ Forecast sales MSE
06-Feb 3
07-Feb 8 0.142208382 0.094449719 0.235930509 4 4.999062106
08-Feb 1 0.142208382 0.094449719 0.235930509 0.069173091 0.10798745 0.209918475 2 0.193212972
09-Feb 3 0.069173091 0.10798745 0.209918475 0.069173091 0.10798745 0.209918475 3 1.473636406
10-Feb 5 0.069173091 0.10798745 0.209918475 0.145309691 0.125029156 0.15381477 4 4.509703765
11-Feb 8 0.145309691 0.125029156 0.15381477 0.145309691 0.125029156 0.15381477 5 7.978798308
12-Feb 3 0.145309691 0.125029156 0.15381477 0.116138478 0.113240008 0.130951664 6 13.46953186
13-Feb 7 0.116138478 0.113240008 0.130951664 0.120159054 0.107930279 0.132951374 4 6.430041868
14-Feb 5 0.120159054 0.107930279 0.132951374 5 4.489021233

My opinion is that R can do a much better job of estimating alpha, beta and gamma, leading to a fitted model, by using the HoltWinters or ets functions in R.
I also believe that the estimates are more accurate. But these are hypotheses. I have to prove that the null hypothesis is incorrect. Truly a challenge.
That is why I need to see the moving ABG values in a table. So yes, I would use a fitted model at the end of the day, but I would need to see the details of how the final ABG for the time series is calculated. If the ABG is shown per line (a moving value over 20 lines instead of 3?) and the final ABG is shown for, let's say, 20 lines, that final ABG should be a 'summary' of the former 19 lines. It should be a rolling value, though; I believe what I need is called a 'moving' ABG value over 20 lines.

Not sure if I explained it well enough, but I basically need a table calculated in R, per sales line, showing the alpha, beta and gamma values. If that needs 20 lines as a basis (instead of 3), that is fine as well. I do not know how many lines would be required to show valid ABG values. My current database has 2000 lines, so a moving ABG based on every 20 lines is also OK.

So, again, I just need a table calculating and showing the ABG values, moving (rolling) over 20 lines. Like the Excel example, but this time in R code.
So far, I have R code which shows me the final ABG for a whole time series (please see my earlier messages).

Hope the above helps.
And by the way,
I really appreciate your help :slight_smile:

Cheers
