Hi Time Series Experts,

I have 6 columns: x1, x2, x3, x4, x5, y.

x1 is the date column

y is the output

x2, x3, x4 and x5 are different variables which influence y.

So, the first 5 columns all have an impact on the y outcome.

I would have used ARIMA if only the date (x1) and y were in the dataset.

Eg: [http://rstudio-pubs-static.s3.amazonaws.com/311446_08b00d63cc794e158b1f4763eb70d43a.html]

But the other variables influence the y outcome, so I don't want to remove them.

Is there a function or way to fit a time series model that takes all 6 columns into account?

P.S.: I used one-hot encoding to transform the categorical variable.

**Reprex**

We can simply apply auto.arima to the AirPassengers dataset that ships with R, but what if more than one variable influences the output variable?

Let's consider the airquality dataset in R:

We want to predict Wind.

But Solar.R, Ozone and Temp also influence this output variable, Wind.

How can I use all these variables, along with the time columns Month and Day, in order to predict Wind?

Although I found someone who applied this on Kaggle, I could not grasp how they did it.

https://www.kaggle.com/raenish/time-series-on-air-quality/code
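To make the setup concrete, here is a minimal sketch of how I would turn Month and Day into a proper date column. It assumes the year 1973 (the year the airquality readings were taken); the names `aq` and `Date` are just ones I picked for illustration.

```r
# airquality ships with base R (package "datasets")
aq <- airquality

# Combine Month and Day into a Date; the year 1973 is filled in by hand
aq$Date <- as.Date(sprintf("1973-%02d-%02d", aq$Month, aq$Day))

# Check that the series is daily with no missing days
all(diff(aq$Date) == 1)

# Ozone and Solar.R contain missing values, Wind and Temp do not
colSums(is.na(aq[, c("Ozone", "Solar.R", "Wind", "Temp")]))
```

So the time index itself is regular, but two of the predictors have gaps that would need handling before modelling.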

**Code for forecasting AirPassengers**

```
library(forecast)
library(ggplot2)  # provides geom_smooth() and labs()
# Plot time series data
plot(AirPassengers)
autoplot(AirPassengers) +
  geom_smooth(method = "lm") +
  labs(x = "Date", y = "Passenger numbers (1000's)",
       title = "Air Passengers from 1949 to 1960")
# Apply Auto arima
arimaAP <- auto.arima(AirPassengers)
# Forecast next 36 months
forecastAP <- forecast(arimaAP, level = c(95), h = 36)
autoplot(forecastAP)
```

As we can see, the AirPassengers data has only one variable that depends on time. Could you please advise how to deal with a dataset such as airquality, where more than one variable influences the output?
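One direction I have come across, but am not sure is right for my case, is regression with ARIMA errors, where the extra variables are passed through the `xreg` argument of `auto.arima`. Below is a minimal sketch of what I think that would look like on airquality. The assumptions are mine: gaps in Ozone and Solar.R are filled with `na.interp` from the forecast package, the last 10 days are held out, and the future values of the predictors are treated as known when forecasting (which would not be true in a real forecast).

```r
library(forecast)

aq <- airquality

# Assumption: fill the missing predictor values by interpolation
aq$Ozone   <- as.numeric(na.interp(ts(aq$Ozone)))
aq$Solar.R <- as.numeric(na.interp(ts(aq$Solar.R)))

# Output series and numeric matrix of external regressors
y    <- aq$Wind
xreg <- as.matrix(aq[, c("Ozone", "Solar.R", "Temp")])

# Assumption: hold out the last 10 days to mimic "future" data
n     <- nrow(aq)
train <- 1:(n - 10)
test  <- (n - 9):n

# Regression on the predictors with ARIMA errors
fit <- auto.arima(y[train], xreg = xreg[train, ])

# Forecasting needs the regressor values for the forecast period;
# the horizon is taken from the number of rows supplied in xreg
fc <- forecast(fit, xreg = xreg[test, ], level = 95)
autoplot(fc)
```

If this is roughly the right idea, my remaining doubt is how to forecast when the future values of Ozone, Solar.R and Temp are not available.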

Thanks in advance,

Abi