@ben-e Is your problem solved, or are you still searching for a more general solution?
I think it is sometimes worth "physically" creating the data, as long as it doesn't become too big.
Things are easier to test and errors become more obvious. Code written on top of it might also become simpler (maybe some tidyr magic for the many models...).
So the idea is to create one column for each variable combination that can occur.
Doing this by hand can cause quite a headache, but if we just combine the names via utils::combn(),
it turns out to be a nice example of non-standard evaluation:
- create the expressions as strings (each expression is a combination of variable names joined by a binary numeric or logical operator like "+", "*", "&" or "|")
- create the column names (since the expressions themselves might be bad names)
- evaluate each expression in the context of your data, assign the result to a new object, and name it
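To make the three steps above concrete, here is a minimal base R sketch for a single combination. The data frame `toy` and the variable names in it are made up for illustration and are not part of the example further down.

```r
# Hypothetical toy data, only for illustrating the three steps
toy <- data.frame(x = c(0L, 1L, 1L), y = c(1L, 1L, 0L))

vars     <- c("x", "y")
expr_str <- paste(vars, collapse = " & ")    # step 1: the expression as a string, "x & y"
col_name <- paste(vars, collapse = "_and_")  # step 2: a syntactically safe name, "x_and_y"

# step 3: parse the string and evaluate it with the columns of `toy` in scope,
# then store the result under the new name
toy[[col_name]] <- as.integer(eval(parse(text = expr_str), envir = toy))
toy
```

The function below does the same thing, just for every combination that combn() produces.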
The function could probably be adjusted to take arbitrary integers or integer sequences for m, so that only combinations of specific orders are calculated.
# ____________________________________________________________________________
# Load libraries
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(purrr))
# ____________________________________________________________________________
# create test data
set.seed(123)
df_test <- tibble(a = sample(0:1, 6, replace = TRUE),
                  b = sample(0:1, 6, replace = TRUE),
                  c = sample(0:1, 6, replace = TRUE),
                  d = sample(0:1, 6, replace = TRUE))
# ____________________________________________________________________________
# Function combine_binary
# data: data.frame with binary entries (logical or coercible to logical)
# bin_op: string, binary operator used to build the expressions
# sep: string, used to join the variable names into the new column names
combine_binary <- function(data, bin_op, sep) {
  ## ............................................................................
  # create combinations of column names for every order 1..ncol(data)
  # (as.data.frame() avoids tibble's warning about unnamed matrix columns)
  comb <- seq_along(data) %>%
    map(~ combn(names(data), m = .x)) %>%
    map(~ as.data.frame(.x, stringsAsFactors = FALSE))
  # create expressions to evaluate
  comb_expressions <- comb %>%
    map(~ map_chr(.x, ~ paste(.x, collapse = bin_op))) %>%
    flatten_chr()
  # create names for expressions
  comb_names <- comb %>%
    map(~ map_chr(.x, ~ paste(.x, collapse = sep))) %>%
    flatten_chr()
  ## ............................................................................
  # evaluate expressions in the context of data, convert logical results
  # to integer, set names, assign to tibble and return
  map(comb_expressions,
      ~ eval(parse(text = .x), envir = data)) %>%
    map_if(is.logical, as.integer) %>%
    set_names(comb_names) %>%
    as_tibble()
}
# ____________________________________________________________________________
And test it:
combine_binary(df_test, bin_op = " & ", sep = "_and_")
#> # A tibble: 6 x 15
#> a b c d a_and_b a_and_c a_and_d b_and_c b_and_d c_and_d
#> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
#> 1 0 1 1 0 0 0 0 1 0 0
#> 2 1 1 1 1 1 1 1 1 1 1
#> 3 0 1 0 1 0 0 0 0 1 0
#> 4 1 0 1 1 0 1 1 0 0 1
#> 5 1 1 0 1 1 0 1 0 1 0
#> 6 0 0 0 1 0 0 0 0 0 0
#> # ... with 5 more variables: a_and_b_and_c <int>, a_and_b_and_d <int>,
#> # a_and_c_and_d <int>, b_and_c_and_d <int>, a_and_b_and_c_and_d <int>
combine_binary(df_test, bin_op = " | ", sep = "_or_")
#> # A tibble: 6 x 15
#> a b c d a_or_b a_or_c a_or_d b_or_c b_or_d c_or_d
#> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
#> 1 0 1 1 0 1 1 0 1 1 1
#> 2 1 1 1 1 1 1 1 1 1 1
#> 3 0 1 0 1 1 0 1 1 1 1
#> 4 1 0 1 1 1 1 1 1 1 1
#> 5 1 1 0 1 1 1 1 1 1 1
#> 6 0 0 0 1 0 0 1 0 1 1
#> # ... with 5 more variables: a_or_b_or_c <int>, a_or_b_or_d <int>,
#> # a_or_c_or_d <int>, b_or_c_or_d <int>, a_or_b_or_c_or_d <int>
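Regarding the idea of restricting the calculation to specific orders: one way it might look is sketched below in base R. The function name `combine_binary_m`, its `m` argument and the small data frame `df_small` are assumptions made up for this sketch, and it returns a plain data.frame rather than a tibble.

```r
# Sketch only: `combine_binary_m` and its `m` argument are hypothetical.
# m: integer vector of combination orders to compute (default: all orders)
combine_binary_m <- function(data, bin_op, sep, m = seq_along(data)) {
  stopifnot(all(m >= 1), all(m <= ncol(data)))
  # one character vector of variable names per combination, for each order in m
  comb <- unlist(
    lapply(m, function(k) combn(names(data), k, simplify = FALSE)),
    recursive = FALSE
  )
  comb_expressions <- vapply(comb, paste, character(1), collapse = bin_op)
  comb_names       <- vapply(comb, paste, character(1), collapse = sep)
  # evaluate each expression with the columns of `data` in scope
  cols <- lapply(comb_expressions, function(e) {
    res <- eval(parse(text = e), envir = data)
    if (is.logical(res)) as.integer(res) else res
  })
  names(cols) <- comb_names
  data.frame(cols, check.names = FALSE)
}

# Hypothetical small input: only the pairwise combinations (order 2)
df_small <- data.frame(a = c(0L, 1L, 1L), b = c(1L, 1L, 0L), c = c(1L, 0L, 1L))
res_pairs <- combine_binary_m(df_small, bin_op = " & ", sep = "_and_", m = 2)
res_pairs
```

Passing something like m = c(1, 3) should then give only the single columns and the three-way combinations.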