Hi Time Series Experts,
I have a dataset with 6 columns: x1, x2, x3, x4, x5 and y.
x1 is the date column
y is the output
x2, x3, x4 and x5 are different variables that influence y.
So the first 5 columns all affect the outcome y.
I would have used ARIMA if the dataset contained only the date column (x1) and y.
But the other variables also influence y, so I don't want to remove them.
Is there a function or method for fitting a time series model that takes all 6 columns into account?
P.S. I used one-hot encoding to transform the categorical variable.
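To make the data layout concrete, here is a minimal sketch of how the predictors could be turned into a numeric matrix with the categorical column one-hot encoded. The data frame `df`, its values, and the choice of x5 as the categorical column are all made up for illustration; only the column names follow the description above.

```r
# Hypothetical toy data mirroring the layout described above.
# All values are invented; x5 is assumed to be the categorical variable.
df <- data.frame(
  x1 = seq(as.Date("2020-01-01"), by = "day", length.out = 6),
  x2 = c(1.2, 0.8, 1.5, 1.1, 0.9, 1.3),
  x3 = c(10, 12, 11, 13, 12, 14),
  x4 = c(5, 3, 4, 6, 5, 4),
  x5 = factor(c("a", "b", "c", "a", "b", "c")),
  y  = c(20, 22, 25, 21, 23, 26)
)

# model.matrix() expands the factor into 0/1 dummy columns.
# Dropping the intercept column leaves one dummy per non-baseline level,
# which avoids perfectly collinear regressors.
xreg <- model.matrix(~ x2 + x3 + x4 + x5, data = df)[, -1]

colnames(xreg)
```

The result is a plain numeric matrix with one row per date, which is the shape that regression-style time series functions generally expect for extra predictors.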
We can simply apply auto.arima to the AirPassengers dataset that ships with R, but what if more than one variable influences the output variable?
Let's consider the airquality dataset in R.
We want to predict Wind,
but Solar.R, Ozone and Temp all influence this output variable.
How can I use all these variables, along with the time columns Month and Day,
to predict Wind?
I found that someone had applied this kind of approach on Kaggle, but I could not follow how they did it.
Code for forecasting AirPassengers:

```r
library(forecast)
library(ggplot2)  # needed for geom_smooth() and labs()

# Plot time series data
plot(AirPassengers)
autoplot(AirPassengers) +
  geom_smooth(method = "lm") +
  labs(x = "Date", y = "Passenger numbers (1000's)",
       title = "Air Passengers from 1949 to 1961")

# Apply auto.arima
arimaAP <- auto.arima(AirPassengers)

# Forecast next 36 months
forecastAP <- forecast(arimaAP, level = c(95), h = 36)
autoplot(forecastAP)
```
As we can see, the AirPassengers data has only one variable that depends on time. Could you please advise how to handle a dataset such as airquality, where more than one variable influences the output?
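One direction that looks relevant is regression with ARIMA errors, which `auto.arima` supports through its `xreg` argument. The following is only a minimal sketch on airquality, not a verified solution. It assumes the rows are already in date order, fills the missing Ozone and Solar.R values by interpolation with `na.interp`, and uses an arbitrary 14-day holdout; all of these are my own choices for illustration.

```r
library(forecast)

data(airquality)

# Target series: daily Wind, 153 rows from May to September
wind <- airquality$Wind

# Extra predictors as a numeric matrix, one row per day.
# Ozone and Solar.R contain NAs, so they are interpolated here
# (an assumption; other imputation choices are possible).
xreg <- cbind(
  Temp    = airquality$Temp,
  Ozone   = as.numeric(na.interp(ts(airquality$Ozone))),
  Solar.R = as.numeric(na.interp(ts(airquality$Solar.R)))
)

# Hold out the last h days to mimic forecasting (h is arbitrary)
n <- length(wind)
h <- 14
train <- 1:(n - h)
test  <- (n - h + 1):n

# Regression on the predictors with ARIMA-modelled errors
fit <- auto.arima(ts(wind[train]), xreg = xreg[train, ])

# Forecasting needs predictor values for the forecast period;
# here they come from the held-out rows
fc <- forecast(fit, xreg = xreg[test, ], level = 95)

autoplot(fc)
accuracy(fc, wind[test])
```

The main catch with this approach is that forecasting requires values of Temp, Ozone and Solar.R for the future dates as well, so in practice those would have to be known in advance or forecast separately.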
Thanks in advance,