Dataset requirements for fable (tidy forecasting in R)

Hi, I am trying out the fable package with this dataset, but it does not seem to work.
I was trying to fit a fable::ARIMA() model to 5*6 = 30 series.
I think the reason could be that the dataset does not meet the requirements of the model.
So I want to ask: what are the requirements for the dataset, so that I can prepare my data accordingly?

I tried to attach a sample of the dataset for reference, but I cannot upload it in Excel or CSV format, and as a new user I am also not able to upload a PDF version.
So I am only able to upload a screenshot.

I also have some specific questions, as follows.

Regarding the dataset: my attempt is based on the video I watched here: https://www.youtube.com/watch?v=6v3_AsbhqrE

  1. Is there a limit on the number of series? What if I have about 300 or 400 series?

  2. Does each series need to be continuous in time? E.g. say I have one series called ‘cafe’ with one year of sales transaction data, but out of 365 days only 278 days have sales; on the other days the data is either missing or sales are zero. In the data frame, should I put back all 365 days for this ‘cafe’ series, or can I just leave it as it is?

  3. Different series have different sales dates: e.g. for ‘cafe’ I have 01-01, 01-02, 01-05, … but for ‘Butter’ I have 01-01, 02-01, 02-03, …, so the dates differ, and for butter I only have 159 days of data.
    Does that matter?

  4. How many category columns can we add? E.g. the example uses 2 columns, with one series for each combination of ‘state’ and ‘industry’. Could I add more dimensions, e.g. a ‘category’ column (a sub-industry category such as café in the city, café for lunch, …) and a ‘city name’ column as a sub-dimension of ‘state’? Then we would need four columns, with one series for each combination of ‘state’, ‘city’, ‘industry’ and ‘category’. Can the package deal with 4 dimensions, or even more, say 6 dimensions? Or do I have to prepare the data so that those 4 or 6 dimensions are collapsed into only 2 columns?

Hope to get some guidance

Thank you!

The fundamental problem you face is clearly defining your observations. In a time series, you will end up with time as the x axis, corresponding to Date in your source data. What will be the y axis? Aggregate sales of all products by all stores per date? Or is it something else? What your time series will show is essential to being able to advise on how the data should be organized.

Second, you will get a much better response if you include a reproducible example, called a reprex, and identify the origin of the package, in this case fable (https://github.com/tidyverts/fable).
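When a file cannot be attached, one common way to share data in a post is to paste the output of `dput()`. The sketch below is only an illustration: the `toy` data frame and its column names are hypothetical stand-ins for the real sales data, and the values are made up.

```r
# Hypothetical sample standing in for the real sales data (values are made up)
toy <- data.frame(
  date    = as.Date(c("2019-01-01", "2019-01-02", "2019-01-05", "2019-01-01")),
  product = c("cafe", "cafe", "cafe", "butter"),
  sales   = c(12, 7, 9, 3)
)

# dput() prints R code that recreates the object;
# paste that printed output into the post so others can rebuild the data
dput(toy)
```

With real data, something like `dput(head(my_data, 20))` (where `my_data` is whatever your data frame is called) keeps the sample small enough to post.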

Third, read the documentation for fable closely. It expects input as a tsibble, that is, a tibble with a declared time index and one or more key columns that identify each series. To transform your data into a tsibble, you need to understand that the underlying principle is one row per time point per series, where a series is identified by store_number, product_number or some constructed variable from your data. tsibble requires it and fable requires a tsibble object.

This is why the y axis identification is so critical.
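As a rough sketch of what that organization can look like, assume daily sales is the y variable and each series is identified by several key columns. The data, column names (`state`, `city`, `industry`, `sales`) and values below are invented for illustration, and treating missing days as zero sales is an assumption that may not suit your data. `fable::model()` takes a tsibble (a tibble with a time index and key columns); the sketch declares the keys, checks for implicit gaps, fills them, and fits one ARIMA model per series. It assumes dplyr, tsibble and fable are installed; automatic order selection may additionally need the feasts and urca packages.

```r
library(dplyr)
library(tsibble)
library(fable)

# Made-up daily sales for two series; all names and values are hypothetical
set.seed(1)
sales <- tibble::tibble(
  date     = rep(as.Date("2019-01-01") + 0:59, times = 2),
  state    = "A",
  city     = "X",
  industry = rep(c("cafe", "butter"), each = 60),
  sales    = rpois(120, lambda = 20)
) %>%
  # drop some 'butter' days to mimic dates with no recorded sales
  filter(!(industry == "butter" & as.integer(format(date, "%d")) %% 5 == 0))

# One row per date per series: the key columns jointly identify a series
sales_ts <- sales %>%
  as_tsibble(index = date, key = c(state, city, industry))

# Report, per series, whether dates are missing from the regular daily grid
has_gaps(sales_ts)

# Make the missing dates explicit; here they are ASSUMED to mean zero sales
sales_full <- sales_ts %>%
  fill_gaps(sales = 0L)

# Fit one automatically selected ARIMA model per series, then forecast
fit <- sales_full %>%
  model(arima = ARIMA(sales))
fc <- fit %>%
  forecast(h = "14 days")
```

More key columns can be listed in `key = c(...)` in the same way. By default `fill_gaps()` pads each series only within its own date range; `.full = TRUE` pads every series to the overall range instead.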
