Should I truncate some observations of a time series before producing a forecast?

Hello everyone, I am working on hourly forecasts for shops and restaurants. Sometimes I am confused about how the number of observations in a time series relates to its frequency. Example: I have a time series of 10 observations, observeData <- c(2,3,5,7,4,6,8,5,7,9), and the frequency of the series is 3. Since the frequency is 3, should I truncate observeData from 10 (3x3 + 1) observations to 9 (3x3) before passing it to a forecasting function such as auto.arima()? That is, go from observeData <- c(2,3,5,7,4,6,8,5,7,9) to observeData <- c(3,5,7,4,6,8,5,7,9). My intuition is that auto.arima() and other forecasting algorithms need the data as a rectangular matrix, and 10 (3x3 + 1) observations cannot form a rectangular matrix for computation, while 9 (3x3) observations can.

  1. How can hourly data give you a frequency of 3?
  2. There is no need to truncate the data.
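Point 2 can be checked directly in R. The following is a minimal sketch, not part of the original reply: it stores all 10 observations in a `ts` object with `frequency = 3` and fits a model without dropping anything. The base-R `arima()` call with a fixed order is only a stand-in chosen so the example runs without extra packages. `auto.arima()` from the forecast package accepts the same `ts` object, as noted in the commented line.

```r
# Ten observations with a seasonal period of 3; the final cycle is incomplete.
observeData <- c(2, 3, 5, 7, 4, 6, 8, 5, 7, 9)
y <- ts(observeData, frequency = 3)

length(y)     # all 10 observations are kept
frequency(y)  # 3
cycle(y)      # position of each observation within its period

# Stand-in model from base R (the order is an arbitrary choice for illustration).
fit <- arima(y, order = c(0, 1, 1))

# With the forecast package installed, the same object can be used instead:
# fit <- forecast::auto.arima(y)

# Forecast one full period ahead.
fc <- predict(fit, n.ahead = 3)
fc$pred
```

A `ts` object is stored as a plain vector plus start, end and frequency attributes, so an incomplete last period is just a shorter final cycle rather than a missing matrix row.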

Dear Rob J Hyndman, thanks for your question. It's nice to find you in this community. I have been following your book Forecasting: Principles and Practice (2nd edition) since 2020 for most of my implementation. There is a little story behind how I found this book. Anyway, let me answer your question.

Answer to your question: Imagine a restaurant that opens only 3 days per week (Monday, Wednesday and Friday), and on each of those days its opening hours are 11:00 to 14:00. So the hourly frequency is 3 instead of 24, and the daily frequency is also 3 instead of 7. If there is a mistake, please explain it to me. I am learning, and I am starting a new business to analyze minute-level sales data. In my business (http://www.sale4cast.com/), we collect historical minute-level sales data from customers. Once 2 months of minute-level sales data has been collected, we aggregate it into hourly sales data and slice the hourly data by weekday, because a restaurant's opening hours differ between weekdays. For example, the current Wednesday's data is joined with the previous Wednesdays' data to make a Wednesday hourly time series. We then produce hourly sales forecasts for each weekday.
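The aggregation and slicing steps described above can be sketched in base R. Everything here is an assumption made for illustration: the simulated data, the column names `time`, `amount`, `hour` and `wday`, and the four-week window are hypothetical and are not the actual sale4cast pipeline.

```r
# Hypothetical minute-level sales: open Mon/Wed/Fri, 11:00-13:59, for four weeks.
set.seed(1)
days <- seq(as.Date("2024-01-01"), by = "day", length.out = 28)  # starts on a Monday
open_days <- days[as.POSIXlt(days)$wday %in% c(1, 3, 5)]         # 1 = Mon, 3 = Wed, 5 = Fri

# One timestamp per minute of opening time (180 minutes per open day).
opening <- as.POSIXct(paste(format(open_days), "11:00:00"), tz = "UTC")
stamps <- rep(opening, each = 180) + 60 * rep(0:179, times = length(opening))
sales <- data.frame(time = stamps, amount = rpois(length(stamps), lambda = 2))

# Aggregate minute sales into hourly sales.
sales$hour <- format(sales$time, "%Y-%m-%d %H", tz = "UTC")
hourly <- aggregate(amount ~ hour, data = sales, FUN = sum)

# Slice by weekday: keep only Wednesdays and build a series with 3 hours per day.
hourly$wday <- as.POSIXlt(as.Date(substr(hourly$hour, 1, 10)))$wday
wed <- hourly[hourly$wday == 3, ]
wed_ts <- ts(wed$amount, frequency = 3)
wed_ts
```

In this sketch each Wednesday contributes one period of 3 hourly values, so `frequency = 3` matches the opening hours described in the post.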

In Section 2.1 of Forecasting: Principles and Practice (2nd edition), I believe the daily frequency for a week was taken as 7 and the weekly frequency for a year as 52 instead of 52.18. So a 7x52 rectangular matrix is obtained for computation instead of 7x52.18.
