Running Time Series Models in Parallel

omario · February 28, 2023, 12:22am

I have the following dataset:

countries <- c(1, 2)
 
cities <- c("A", "B")   

weeks <- rep(seq(as.Date("2010-01-01"), as.Date("2023-01-01"), by="week"), length(countries)*length(cities))
 
counts <- rpois(length(weeks), lambda = 50)

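df <- data.frame(
  # Each country covers half of the rows; within a country, each city gets one full run of weeks
  Country = rep(countries, each = length(weeks) / length(countries)),
  City = rep(rep(cities, each = length(weeks) / (length(countries) * length(cities))), times = length(countries)),
  Week = as.character(weeks),
  Count = counts
)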

I want to fit 4 time series models (auto.arima) to Country1-CityA, Country1-CityB, Country2-CityA and Country2-CityB

I did this with the following code:

library(dplyr)
library(tidyr)    # provides nest()
library(forecast)

#Group data by Country and City
df_grouped <- df %>% group_by(Country, City)

# Define a function to fit ARIMA model for each group
fit_arima <- function(df) {
  arima_model <- auto.arima(df$Count)
  return(arima_model)
}

# Apply the function to each group using lapply
arima_models <- df_grouped %>% nest() %>% 
  mutate(arima_model = lapply(data, fit_arima))

# Extract models for each group
arima_model_a_a <- arima_models$arima_model[[1]]
arima_model_a_b <- arima_models$arima_model[[2]]
arima_model_b_a <- arima_models$arima_model[[3]]
arima_model_b_b <- arima_models$arima_model[[4]]
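
Pulling the models out by position depends on the order of the groups. A minimal sketch of an alternative that keeps each model under its group label is shown here; the toy data, the 104-week length and the names `toy` and `models` are assumptions made only for this illustration.

```r
library(forecast)

set.seed(1)

# Hypothetical toy data: one weekly series per Country-City pair
weeks <- seq(as.Date("2020-01-03"), by = "week", length.out = 104)
toy <- expand.grid(Week = weeks, City = c("A", "B"), Country = c(1, 2),
                   stringsAsFactors = FALSE)
toy$Count <- rpois(nrow(toy), lambda = 50)

# split() labels each piece "Country.City", so the models keep their group names
models <- lapply(split(toy$Count, list(toy$Country, toy$City)), auto.arima)

names(models)
models[["1.A"]]   # model for Country 1, City A
```

Looking a model up by name (`models[["2.B"]]`) avoids having to remember which index belongs to which group.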

My Question: Now, I want to improve the performance of this code using "parallel computing".

I read that there are different options for this task, including libraries such as "foreach", "doParallel", "doSNOW" and "parallel".

I tried to adapt some of these methodologies to my problem:

library(parallel)

# Define the number of cores to use (leave one free)
no_cores <- detectCores() - 1

# Create a cluster 
cl <- makeCluster(no_cores)

# Load the required libraries on every worker
clusterEvalQ(cl, {
  library(forecast)
})


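# One data frame per Country-City group
# (assumed definition: ARIMA_List is not created anywhere above)
ARIMA_List <- split(df, list(df$Country, df$City))

# Use parLapply to fit ARIMA models in parallel
ARIMA_Models <- parLapply(cl, ARIMA_List, function(x) {
  auto.arima(x$Count)
})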

# Shut down the cluster
stopCluster(cl)

The code seems to have run - but can someone please help me understand if I have done this correctly? Is there a better way to accomplish this?
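
One way to check a parallel run like this is to fit the same groups sequentially and compare the coefficients. The following is only a sketch under assumptions: the toy data, the two-worker cluster and the names `toy`, `groups`, `seq_models`, `par_models` and `same_fit` are hypothetical and not part of my code above.

```r
library(parallel)
library(forecast)

set.seed(1)

# Hypothetical toy data: one weekly series per Country-City pair
weeks <- seq(as.Date("2020-01-03"), by = "week", length.out = 104)
toy <- expand.grid(Week = weeks, City = c("A", "B"), Country = c(1, 2),
                   stringsAsFactors = FALSE)
toy$Count <- rpois(nrow(toy), lambda = 50)

# One data frame per group, named "Country.City"
groups <- split(toy, list(toy$Country, toy$City))

# Sequential reference fit
seq_models <- lapply(groups, function(x) forecast::auto.arima(x$Count))

# Parallel fit; two workers is an arbitrary choice for this sketch
cl <- makeCluster(2)
par_models <- tryCatch(
  parLapply(cl, groups, function(x) forecast::auto.arima(x$Count)),
  finally = stopCluster(cl)   # always release the workers, even on error
)

# Compare coefficients group by group
same_fit <- mapply(function(a, b) isTRUE(all.equal(coef(a), coef(b))),
                   seq_models, par_models)
print(same_fit)
```

Calling `forecast::auto.arima` with the namespace prefix means the workers load the package on demand, so this sketch does not need a separate `clusterEvalQ()` step. With only four short series, the cost of starting the cluster may outweigh any speed-up, so timing both versions with `system.time()` on the real data would show whether parallelism helps here.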

Thanks!
