I am trying to build a loop with the ccf function, but unfortunately I am not getting very far. I have an Excel file (external_data_train) with 156 obs. of 50 variables. I would like to compute the cross-correlation of each variable with window$demand for 20 lags. I am therefore looking for a table output with the lags in the header and one row per variable. A nice add-on would be a plot for each variable as well.
So far I have used the following, but it's not working at all:
k <- ncol(external_data_train)
ccf_mat <- matrix(0, nrow=37 , ncol= k)
for(i in 1:k) {
ccf_mat[,i] <- ccf(external_data_train[,i], window$demand)
}
I made a data frame with four columns of 156 observations and calculated the ccf of each with a vector named Demand. The ccf function returns a list, and I assumed you want the acf element of that list. See the help for ccf() to learn about the other components of the list.
I used the sapply function to iterate over the columns of the data frame. I also calculated the ccf() of just the first column to provide a comparison with the matrix output from sapply(). That step is not necessary.
The plots appear in the Plots pane of RStudio. If you have 50 variables, I do not think they will all be available.
set.seed(1)
DF <- data.frame(A = runif(156), B = runif(156), C = runif(156), D = runif(156))
Demand <- runif(156)
#Manually calculate ccf() of first column
ccfOfA <- ccf(DF$A, Demand, lag.max = 20)$acf
head(ccfOfA)
#> [1] 0.162738631 -0.089532244 0.042114383 -0.014318872 -0.135267507
#> [6] -0.003926663
#Make matrix of ccf with Demand for all columns of DF
ccfOut <- sapply(DF, function(x) ccf(x, Demand, lag.max = 20)$acf)
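The sapply() call above returns one column per variable and one row per lag. To get the layout asked for in the question (lags in the header, one row per variable), the result can be transposed and labelled. The following is only a sketch on simulated data: DF and Demand stand in for external_data_train and window$demand, and the names max_lag, ccf_list and ccf_table are made up for this example. It also passes plot = FALSE so that no plots are drawn while the table is built.

```r
# Simulated stand-ins for external_data_train and window$demand (assumption)
set.seed(1)
DF <- data.frame(A = runif(156), B = runif(156), C = runif(156), D = runif(156))
Demand <- runif(156)

max_lag <- 20

# plot = FALSE computes the cross-correlations without drawing anything
ccf_list <- lapply(DF, function(x) ccf(x, Demand, lag.max = max_lag, plot = FALSE))

# Flatten each $acf array to a plain vector, then transpose so that
# every variable becomes a row and every lag becomes a column
ccf_table <- t(sapply(ccf_list, function(res) as.vector(res$acf)))

# Use the lags reported by ccf() itself as the column header
colnames(ccf_table) <- as.vector(ccf_list[[1]]$lag)

dim(ccf_table)
round(ccf_table[, c("-1", "0", "1")], 3)
```

With lag.max = 20 there are 41 columns (lags -20 to 20). If the table should go back to Excel, write.csv(ccf_table, ...) would be one simple option.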
Thanks for your reply.
I changed your code to: ccf_mat <- sapply(external_data_train , function(x) ccf(x, window$demand, lag.max = 20)$acf)
where external_data_train is the table with the different variables and window$demand is my general demand.
But I keep receiving the following error:
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Error in plot.window(...) : finite 'ylim' values necessary
Additionally: Warning messages:
1: In min(x) : no non-missing argument for min; return Inf
2: In max(x) : no non-missing argument for max; return -Inf
I do not know why the plotting is failing. I would try running subsets of the columns, maybe 10 at a time, and see whether the plotting fails for a particular column or whether there is some other pattern.
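Instead of trying subsets by hand, each column could be checked on its own with plotting switched off. This is a rough sketch, not a diagnosis: it assumes (without confirmation from the thread) that the problem might come from a column that is not numeric, contains missing values, or is constant. check_column is a hypothetical helper written for this example, and the data are simulated with two deliberately bad columns.

```r
# Simulated stand-in for external_data_train with two problem columns (assumption)
set.seed(1)
DF <- data.frame(A = runif(156), B = runif(156))
DF$Constant <- 5   # zero variance, so the correlation is undefined
DF$Label <- "x"    # not numeric
Demand <- runif(156)

# Hypothetical helper: returns a short status string for one column
check_column <- function(x, y, max_lag = 20) {
  if (!is.numeric(x)) return("not numeric")
  if (anyNA(x)) return("contains NA")
  # plot = FALSE so that a plotting failure cannot hide the real cause
  res <- tryCatch(
    ccf(x, y, lag.max = max_lag, plot = FALSE),
    error = function(e) e
  )
  if (inherits(res, "error")) return(paste("error:", conditionMessage(res)))
  if (!all(is.finite(res$acf))) return("non-finite ccf values")
  "ok"
}

# One status per column; anything other than "ok" deserves a closer look
status <- sapply(DF, check_column, y = Demand)
status
```

Columns reported as "ok" can then be passed to ccf() with plotting enabled, while the others can be inspected or cleaned first.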