I've put together an example using your code to implement this.
Load packages
library(fable)
#> Loading required package: fabletools
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tsibble)
#>
#> Attaching package: 'tsibble'
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, union
Create some sample data
alldata <- as_tsibble(USAccDeaths) %>%
rename(Depvar = value) %>%
mutate(!!!lapply(setNames(letters[1:4], letters[1:4]), function(cn) expr(rnorm(72))))
# Note, head() is suitable here since there is just 1 key, otherwise you need group_by() %>% slice()
data_trian <- head(alldata, -12)
Adapting your code with minimal changes
# Get names of predictors
Cols <- names(alldata)
Cols <- Cols[!(Cols %in% c("index", "Depvar"))] # Better not to consider the index as a regressor - you can use trend() if you like instead.
n <- length(Cols)
# Construct all possible combinations
id <- unlist(
lapply(1:n,
function(i) combn(1:n,i,simplify = FALSE))
, recursive = FALSE)
# Paste them into formulas
frmlas <- sapply(id, function(i)
paste("Depvar ~", paste(Cols[i], collapse = "+")))
# Convert to formulas
frmlas <- lapply(frmlas, as.formula)
# 1st attemp iterating using lapply
fitted_models = lapply(frmlas, function(frml) model(data_trian, ARIMA(!!frml)))
# 2nd attempt iterating using loop
fit <- list()
for (i in 1:length(frmlas)) {
fit[[i]] <- data_trian %>%
model(ARIMA(!!frmlas[[i]]))
}
There are two missing elements in your code.
- Your formulas are just
character
text, they need to be parsed into formulas with something like as.formula()
- Your attempted use of the model formulas in
lappy()
is looking for a variable called frml
, and in the for
loop it is trying to calculate a variable frmlas[i]
. Instead you want to use !!
to say ‘use the value of this object’ rather than the name/expression. You can think of !!
as replacing whatever is on the right with it’s value, so ARIMA(!!frml)
becomes ARIMA(Depvar ~ a)
.
How I would do it
# Get names of predictors
Cols <- measured_vars(alldata)
Cols <- setdiff(Cols, "Depvar")
# Get all combinations of regressors, each regressor can be included or removed (represented with NULL)
comb_xreg <- expand.grid(lapply(Cols, list, NULL))
# Simplify into a list of included regressors
comb_xreg <- split(comb_xreg, seq_len(nrow(comb_xreg))) %>%
# Simplify into vectors, removing NULL
lapply(unlist, use.names = FALSE) %>%
# Convert to list of symbols/names (this is used to identify a variable `a` instead of the text "a")
lapply(syms)
# Create formulas from regressors
rhs <- comb_xreg %>%
lapply(Reduce, f = function(x, y) call("+", x, y))
formulas <- lapply(rhs, function(x) {
# If there are no regressors, just return the response variable without a formula
if(is.null(x)) sym("Depvar") else rlang::new_formula(sym("Depvar"), x)
})
# Create model definitions
mdl_defs <- lapply(formulas, function(frml) ARIMA(!!frml))
# Estimate the models
fit <- data_trian %>%
model(!!!mdl_defs)
# The names from these models are the list names produced by `split()`, you can change these to be whatever you like.
fit
#> # A mable: 1 x 16
#> `1` `2`
#> <model> <model>
#> 1 <LM w/ ARIMA(1,0,1)(0,1,1)[12] errors> <LM w/ ARIMA(0,1,1)(0,1,1)[12] errors>
#> # … with 14 more variables: `3` <model>, `4` <model>, `5` <model>, `6` <model>,
#> # `7` <model>, `8` <model>, `9` <model>, `10` <model>, `11` <model>,
#> # `12` <model>, `13` <model>, `14` <model>, `15` <model>, `16` <model>
My approach is more robust as it programmatically produces the model formula, rather than hoping that the R parser converts your string correctly in as.formula()
. For example, if your data contained a variable named “a+b”, then as.formula()
would incorrectly produce Depvar ~ a + b
instead of Depvar ~ `a+b`
. My approach should work fine for all sorts of syntactically invalid names.
It also keeps all of the models in a single mable, allowing you to immediately compare them with functions like accuracy()
accuracy(fit)
#> # A tibble: 16 × 10
#> .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 Training -19.9 275. 188. -0.254 2.22 0.391 0.457 -0.0324
#> 2 2 Training 56.3 297. 210. 0.656 2.46 0.435 0.494 -0.0225
#> 3 3 Training 49.9 279. 190. 0.576 2.22 0.394 0.464 -0.0147
#> 4 4 Training 55.9 298. 211. 0.646 2.47 0.437 0.496 -0.0269
#> 5 5 Training -20.9 274. 185. -0.263 2.18 0.385 0.455 -0.0353
#> 6 6 Training 58.1 297. 206. 0.682 2.42 0.428 0.494 -0.0271
#> 7 7 Training 50.6 278. 187. 0.588 2.19 0.389 0.462 -0.0154
#> 8 8 Training 57.8 299. 207. 0.674 2.43 0.431 0.496 -0.0317
#> 9 9 Training -19.8 275. 188. -0.254 2.22 0.391 0.457 -0.0324
#> 10 10 Training 55.8 297. 210. 0.650 2.46 0.436 0.494 -0.0216
#> 11 11 Training 49.8 279. 190. 0.575 2.22 0.394 0.464 -0.0145
#> 12 12 Training 55.5 299. 211. 0.642 2.47 0.438 0.496 -0.0263
#> 13 13 Training -20.7 274. 185. -0.261 2.18 0.385 0.455 -0.0352
#> 14 14 Training 57.6 297. 206. 0.676 2.42 0.428 0.494 -0.0268
#> 15 15 Training 50.7 278. 187. 0.588 2.19 0.389 0.462 -0.0154
#> 16 16 Training 57.4 299. 208. 0.670 2.43 0.431 0.497 -0.0316
Created on 2023-01-03 with reprex v2.0.2