I am working on predicting sales from multiple independent variables using time series forecasting. I am still learning time series, so you might see more questions on this topic from me.
My independent variables are on very different scales. Because of this, when I plot them all together to see their trends, some of them sit almost at zero in the line plot. So I am wondering whether I can standardize all of these variables except sales and month, since I want to predict sales from the independent variables. However, I am not sure how to proceed. In my example below, the trend in variable A is clearly visible in the line plot, but C and D are at the very bottom and B is almost lost. I would like to see the trend for all four variables, ideally with their values brought into the same range, but I am not sure how to achieve that.
library(dplyr)
library(tidyr)
library(tsibble)
library(feasts)  # attaches fabletools, which provides autoplot() for tsibbles

# Example Data
df <- data.frame(
  stringsAsFactors = FALSE,
  month = c("2020 Jan", "2020 Feb", "2020 Mar", "2020 Apr", "2020 May"),
  sales = c(2061292, 2087140, 2136628, 449335, 1105069),
  A = c(5067331.423, 4856897.658, 4175123.217, 3494987.878, 3768201.526),
  B = c(153, 146, 115, 108, 133),
  C = c(58.247345, 50.548263, 30.994029, 20.521175, 28.040035),
  D = c(609026, 595426.8, 598968.2, 544902.2, 556805.2)
)

# Creating tsibble (sales is dropped; the remaining variables go to long format)
df <- df %>%
  select(everything(), -sales) %>%
  gather(key = "factors", value = "value", -month) %>%
  mutate(month = yearmonth(month)) %>%
  as_tsibble(key = `factors`, index = `month`)

# Plotting Variables
df %>%
  autoplot(value)
Hopefully the trends for all independent variables will be visible once they are standardized. If there is a more suitable method, please recommend it. I assume nothing needs to be done to sales, since it is the target variable and I don't want to change it.
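To make the idea concrete, here is a rough sketch of what I have in mind, assuming "standardize" means a z-score per variable, (x - mean(x)) / sd(x), computed with base R's scale(). The names df_raw, df_std and df_long are just placeholders of mine, and I plot with plain ggplot2 here to keep the example short. I am not sure this is the right approach.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same predictors as in my example (sales left out on purpose)
df_raw <- data.frame(
  month = c("2020 Jan", "2020 Feb", "2020 Mar", "2020 Apr", "2020 May"),
  A = c(5067331.423, 4856897.658, 4175123.217, 3494987.878, 3768201.526),
  B = c(153, 146, 115, 108, 133),
  C = c(58.247345, 50.548263, 30.994029, 20.521175, 28.040035),
  D = c(609026, 595426.8, 598968.2, 544902.2, 556805.2)
)

# Assumption: z-score each predictor column separately,
# so every column ends up with mean 0 and standard deviation 1
df_std <- df_raw %>%
  mutate(across(A:D, ~ as.numeric(scale(.x))))

# Long format for plotting; keep months in their original order
df_long <- df_std %>%
  pivot_longer(A:D, names_to = "factors", values_to = "value") %>%
  mutate(month = factor(month, levels = df_raw$month))

# One line per variable, all on a comparable scale
p <- ggplot(df_long, aes(month, value, colour = factors, group = factors)) +
  geom_line()
print(p)
```

If changing the values turns out to be a bad idea, another option I have seen is to leave them untouched and give each variable its own panel with facet_wrap(~ factors, scales = "free_y").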
Thanks for your help!