I would like to use machine learning models on multivariate time series data to forecast over long horizons (for example, 400 items with their sales history over the last year plus content features).
From many papers, blogs, and Kaggle notebooks I understand that a time series should be made stationary before applying classical ML algorithms. The reason is that tree-based models such as XGBoost and CatBoost can't extrapolate beyond the range of target values seen in training, so they can't follow a trend into the future.
If I make the mean and variance stationary and add seasonal features (such as lags), then it should be fine.
To stabilize the variance I can use a log or Box-Cox power transformation.
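For reference, here is a minimal sketch of the variance step and its inverse, assuming SciPy is available and the series is strictly positive (Box-Cox requires that). The `sales` array is made-up data, not my real series.

```python
import numpy as np
from scipy.stats import boxcox
from scipy.special import inv_boxcox

rng = np.random.default_rng(0)

# Made-up, strictly positive series whose variance grows with its level
sales = np.exp(np.linspace(1.0, 3.0, 120) + rng.normal(0.0, 0.2, 120))

# Box-Cox: lambda is estimated by maximum likelihood when not supplied
transformed, lam = boxcox(sales)

# Back-transform to the original scale using the same lambda
restored = inv_boxcox(transformed, lam)

# Log alternative; log1p/expm1 also tolerate zeros in the data
log_sales = np.log1p(sales)
restored_log = np.expm1(log_sales)
```

The important detail is that `lam` has to be stored at fit time and reused when inverting the predictions.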
For the mean, though, I can't find a practical way to enforce stationarity. I tried differencing, but since I have a long horizon (for example, 90 future points) I got very bad results.
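This is roughly what I mean by differencing and inverting it, as a minimal sketch with made-up numbers (the helper `invert_difference` and the variable names are just for illustration). The back-transform is a cumulative sum, so a small error in each predicted difference accumulates along the horizon, which may be part of why my 90-step results are so bad.

```python
import numpy as np

def invert_difference(last_observed, predicted_diffs):
    """Rebuild levels from first differences, starting at the last known value."""
    return last_observed + np.cumsum(predicted_diffs)

# Made-up series: 8 points, forecast the last 3 from the first 5
y = np.array([10.0, 12.0, 15.0, 19.0, 24.0, 30.0, 37.0, 45.0])
train, future = y[:5], y[5:]

train_diffs = np.diff(train)          # what the model would be trained on
true_future_diffs = np.diff(y)[4:]    # stand-in for perfect model predictions

# With perfect difference predictions the original levels are recovered
recovered = invert_difference(train[-1], true_future_diffs)

# A constant error of 0.5 in every predicted difference grows linearly in the level
biased = invert_difference(train[-1], true_future_diffs + 0.5)
level_error = biased - future         # error at step k is 0.5 * k
```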
I am familiar with the two types of trend: stochastic and deterministic.
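To make the deterministic case concrete, here is a minimal sketch of removing and restoring a trend, under the assumption that the trend is linear in time (which I have not verified for my data). The series is simulated, and `residual_forecast` is a placeholder for whatever an ML model would predict on the detrended values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated series: linear trend plus noise (assumption, not real sales)
t = np.arange(100)
y = 5.0 + 0.3 * t + rng.normal(0.0, 1.0, 100)

horizon = 90

# Fit a straight line to the training period and subtract it
slope, intercept = np.polyfit(t, y, deg=1)
residuals = y - (intercept + slope * t)   # detrended target for the ML model

# Extrapolate the fitted line over the forecast horizon
t_future = np.arange(len(y), len(y) + horizon)
trend_future = intercept + slope * t_future

# Placeholder: zeros stand in for the model's predictions of the residuals
residual_forecast = np.zeros(horizon)

# Back to the original scale: add the extrapolated trend to the residual forecast
forecast = trend_future + residual_forecast
```

Unlike differencing, the back-transform here adds a fitted function of time instead of a cumulative sum, so per-step errors do not compound, but the result depends entirely on the assumed trend shape holding over the horizon.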
Can someone help with how to make the mean stationary and then transform the predictions back to their original scale? If anyone has a Python code example for such a task, that would be great!