Hi,
I am working on predicting sales from multiple independent variables using time series forecasting. I am still learning time series, so you might see more questions on this topic from me.
My independent variables are on very different scales. Because of this, when I plot them all together in one line plot to see their trends, some of them sit almost at zero. So I am thinking of standardizing all of these variables except sales and month, since I want to predict sales from those independent variables. However, I am not sure how to proceed. In my example below, the trend in variable A is clearly visible in the line plot, but C and D are at the very bottom and B is almost lost. I would like to see the trend for all of these variables. Perhaps all four variables could be rescaled to the same range, but I am not sure how to achieve that.
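# Packages
library(dplyr)    # %>%, select(), mutate()
library(tidyr)    # reshaping to long format
library(tsibble)  # yearmonth(), as_tsibble()
library(ggplot2)  # autoplot() generic
library(feasts)   # attaches fabletools, used to autoplot() a tsibble
# Example Data
df <- data.frame(
  stringsAsFactors = FALSE,
  month = c("2020 Jan", "2020 Feb", "2020 Mar", "2020 Apr", "2020 May"),
  sales = c(2061292, 2087140, 2136628, 449335, 1105069),
  A = c(5067331.423, 4856897.658, 4175123.217, 3494987.878, 3768201.526),
  B = c(153, 146, 115, 108, 133),
  C = c(58.247345, 50.548263, 30.994029, 20.521175, 28.040035),
  D = c(609026, 595426.8, 598968.2, 544902.2, 556805.2)
)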
# Creating tsibble (long format: one series per independent variable, sales dropped)
df <- df %>%
  select(-sales) %>%
  pivot_longer(-month, names_to = "factors", values_to = "value") %>%
  mutate(month = yearmonth(month)) %>%
  as_tsibble(key = factors, index = month)
# Plotting Variables
df %>%
  autoplot(value)
Hopefully the trends for all independent variables will be visible once they are standardized; if not, please recommend another efficient method. I assume nothing needs to be done to sales, since it is the target variable and I don't want to change it.
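For reference, here is a minimal sketch of what I have in mind. It assumes z-score standardization (subtract each variable's mean and divide by its standard deviation) is an acceptable way to put the variables on one scale. I am not sure it is the best choice. The names df_scaled and ts_scaled are placeholders I made up, and sales is left unchanged.

```r
# Sketch only: z-score each independent variable so all series are comparable.
# Assumes dplyr (>= 1.0), tidyr, tsibble and ggplot2 are installed.
library(dplyr)
library(tidyr)
library(tsibble)
library(ggplot2)

df <- data.frame(
  stringsAsFactors = FALSE,
  month = c("2020 Jan", "2020 Feb", "2020 Mar", "2020 Apr", "2020 May"),
  sales = c(2061292, 2087140, 2136628, 449335, 1105069),
  A = c(5067331.423, 4856897.658, 4175123.217, 3494987.878, 3768201.526),
  B = c(153, 146, 115, 108, 133),
  C = c(58.247345, 50.548263, 30.994029, 20.521175, 28.040035),
  D = c(609026, 595426.8, 598968.2, 544902.2, 556805.2)
)

# Standardize every column except month and sales: (x - mean(x)) / sd(x)
df_scaled <- df %>%
  mutate(across(-c(month, sales), ~ (.x - mean(.x)) / sd(.x)))

# Long-format tsibble of the standardized independent variables only
ts_scaled <- df_scaled %>%
  select(-sales) %>%
  pivot_longer(-month, names_to = "factors", values_to = "value") %>%
  mutate(month = yearmonth(month)) %>%
  as_tsibble(key = factors, index = month)

# One line per variable, all on the same standardized y axis
p <- ggplot(ts_scaled, aes(x = month, y = value, colour = factors)) +
  geom_line()
print(p)
```

If the original magnitudes should stay readable, another option might be to keep the raw values and give each variable its own panel by adding facet_wrap(~ factors, scales = "free_y") to the plot instead of standardizing.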
Thanks for your help!