ccf loop with multiple variables

Hello everyone,

I am trying to build a loop with the ccf function, but unfortunately I am not getting very far. I have an Excel file (external_data_train) with 156 obs. of 50 variables. I would like to test the cross-correlation of each variable with the data set window$demand for 20 lags. I am therefore looking for a table output with the lags in the header and one row per variable. A nice add-on would be a plot for each variable as well.

So far I have used the following, but it's not working at all:

k <- ncol(external_data_train)
ccf_mat <- matrix(0, nrow=37 , ncol= k)
for(i in 1:k) {
  ccf_mat[,i] <- ccf(external_data_train[,i], window$demand)
}

instead of

ccfvalues1 = ccf(external_data_train$F1,window$demand, 20)
ccfvalues2 = ccf(external_data_train$F2,window$demand, 20)
ccfvalues3 = ccf(external_data_train$F3,window$demand, 20)
ccfvalues4 = ccf(external_data_train$F4,window$demand, 20)
ccfvalues5 = ccf(external_data_train$F5,window$demand, 20)
ccfvalues6 = ccf(external_data_train$F6,window$demand, 20)
ccfvalues7 = ccf(external_data_train$F7,window$demand, 20)
ccfvalues8 = ccf(external_data_train$F8,window$demand, 20)
ccfvalues9 = ccf(external_data_train$F9,window$demand, 20)

Thanks for your help!
Luke

I made a data frame with four columns of 156 observations and calculated the ccf of each with a vector named Demand. The ccf function returns a list and I assumed you want the acf element of that list. See the help for ccf() to learn about the other components of the list.
I used the sapply function to iterate over the columns of the data frame. I calculated the ccf() of just the first column to provide a comparison with the matrix output provided by sapply(). That step is not necessary.
The plots appear in the Plots pane of RStudio. If you have 50 variables, I do not think they will all be available.

set.seed(1)
DF <- data.frame(A = runif(156), B = runif(156), C = runif(156), D = runif(156))
Demand <- runif(156)
#Manually calculate ccf() of first column
ccfOfA <- ccf(DF$A, Demand, lag.max = 20)$acf

head(ccfOfA)
#> [1]  0.162738631 -0.089532244  0.042114383 -0.014318872 -0.135267507
#> [6] -0.003926663

#Make matrix of ccf with Demand for all columns of DF
ccfOut <- sapply(DF, function(x) ccf(x, Demand, lag.max = 20)$acf)

head(ccfOut[, 1])
#> [1]  0.162738631 -0.089532244  0.042114383 -0.014318872 -0.135267507
#> [6] -0.003926663

Created on 2020-08-04 by the reprex package (v0.3.0)
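The sapply() output above has one column per variable and one row per lag. Since the question asked for lags in the header and one row per variable, here is a minimal sketch of how that table could be built from the same toy data. The names max_lag, ccf_mat and ccf_table and the "lag_" column labels are only illustrative, and plot = FALSE is set so that no plots are drawn while the table is built.

```r
set.seed(1)
DF <- data.frame(A = runif(156), B = runif(156), C = runif(156), D = runif(156))
Demand <- runif(156)

max_lag <- 20  # illustrative name; lags run from -20 to 20

# One column per variable, one row per lag (2 * max_lag + 1 = 41 rows)
ccf_mat <- sapply(DF, function(x) {
  as.vector(ccf(x, Demand, lag.max = max_lag, plot = FALSE)$acf)
})

# Transpose so that each row is a variable and each column is a lag
ccf_table <- t(ccf_mat)
colnames(ccf_table) <- paste0("lag_", -max_lag:max_lag)

dim(ccf_table)
round(ccf_table[, c("lag_-1", "lag_0", "lag_1")], 3)
```

With real data, DF and Demand would be replaced by external_data_train and window$demand. Note that with lag.max = 20 each variable yields 41 values, not 37, so a pre-allocated matrix would need 41 rows.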

Thanks for your reply.
I changed your code into:
ccf_mat <- sapply(external_data_train , function(x) ccf(x, window$demand, lag.max = 20)$acf)
where external_data_train is the table with the different variables and window$demand is my general demand.

But I keep receiving the following error:

Error: no more error handlers available (recursive errors?); invoking 'abort' restart

Error in plot.window(...) : finite 'ylim' values necessary
Additionally: Warning messages:
1: In min(x) : no non-missing argument for min; return Inf
2: In max(x) : no non-missing argument for max; return -Inf

Try running with the plot turned off and see what the output looks like.

ccf_mat <- sapply(external_data_train , function(x) ccf(x, window$demand, lag.max = 20, plot = FALSE)$acf)

I suspect that ccf is returning vectors that are all NA or there is some similar problem.
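One way to check that suspicion is to look at the columns before calling ccf(). The sketch below uses a small made-up data frame (example_data is a hypothetical stand-in for external_data_train) and flags columns that are not numeric, contain missing values, or have no variation. These are only plausible causes, not a confirmed diagnosis: by default ccf() uses na.action = na.fail, and a column with zero variance gives NaN correlations because of a division by zero.

```r
# Hypothetical stand-in for external_data_train
example_data <- data.frame(
  ok       = c(1.2, 3.4, 2.2, 5.1, 4.0),
  constant = rep(7, 5),
  all_na   = rep(NA_real_, 5),
  text     = c("a", "b", "c", "d", "e"),
  stringsAsFactors = FALSE
)

# One row per column: type, number of NAs, and whether the values never vary
column_report <- data.frame(
  variable     = names(example_data),
  is_numeric   = sapply(example_data, is.numeric),
  n_missing    = sapply(example_data, function(x) sum(is.na(x))),
  no_variation = sapply(example_data, function(x) {
    is.numeric(x) && (all(is.na(x)) || isTRUE(var(x, na.rm = TRUE) == 0))
  }),
  row.names = NULL
)

column_report
```

Any column flagged here would be worth inspecting by hand before it goes into the ccf() call.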


When I use plot = FALSE it does return correct values. Do you have any idea why plotting them does not work?

I do not know why the plotting is failing. I would try running subsets of the columns, maybe 10 at a time, and see if the plotting fails for a particular column or if there is some other pattern.
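As an alternative to running subsets by hand, here is a rough sketch that plots one column at a time inside tryCatch() and records which columns fail. It writes the plots to a PDF file instead of the Plots pane, so that every plot is kept. The toy data, the constant column C (included only as a candidate for causing a failure) and the names plot_file and plot_ok are assumptions for illustration.

```r
set.seed(1)
DF <- data.frame(A = runif(156), B = runif(156), C = rep(1, 156))
Demand <- runif(156)

plot_file <- tempfile(fileext = ".pdf")  # illustrative output location
pdf(plot_file)                           # one page per successful plot

# TRUE if the plot for a column was drawn, FALSE if it raised an error
plot_ok <- sapply(names(DF), function(nm) {
  tryCatch({
    res <- ccf(DF[[nm]], Demand, lag.max = 20, plot = FALSE)
    plot(res, main = nm)
    TRUE
  }, error = function(e) {
    message("Plot failed for ", nm, ": ", conditionMessage(e))
    FALSE
  })
})

dev.off()

# Names of the columns whose plot could not be drawn
names(plot_ok)[!plot_ok]
```

With real data, DF and Demand would be replaced by external_data_train and window$demand, and the reported column names would show where to look.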