Columns combination

Hi,

I'm looking for dplyr code that works with data of the following structure:

#> # A tibble: 5 × 5
#>   ID    value_A value_B value_C value_D
#>   <chr> <lgl>   <lgl>   <lgl>   <lgl>  
#> 1 ID_1  TRUE    FALSE   TRUE    FALSE  
#> 2 ID_2  FALSE   TRUE    TRUE    FALSE  
#> 3 ID_3  FALSE   FALSE   TRUE    FALSE  
#> 4 ID_4  TRUE    TRUE    FALSE   FALSE  
#> 5 ID_5  FALSE   FALSE   FALSE   TRUE

I would like to find the combinations of columns that yield the highest 'TRUE ratio' (union). Let's say I want the combination of two columns that, taken together, have a TRUE in the largest number of rows. For instance, the combination of columns 'value_A' & 'value_B' has a 'TRUE ratio' of 3, because rows 1, 2 and 4 have a TRUE in at least one of them.

I'd like to get the combinations of columns ordered by 'TRUE ratio' for a given number x of columns.

First, the ordered combinations of 2 columns...
Second, the ordered combinations of 3 columns...
and so on.

Any ideas? I will appreciate your thoughts.

Many thanks,

Jakub

Finding the ratios is easy enough, but I'm not sure what is supposed to be done with them.

d <- data.frame(
  V1 = c(TRUE, FALSE, FALSE, TRUE, FALSE),
  V2 = c(FALSE, TRUE, FALSE, TRUE, FALSE),
  V3 = c(TRUE, TRUE, TRUE, FALSE, FALSE),
  V4 = c(FALSE, FALSE, FALSE, FALSE, TRUE)
)
(m <- as.matrix(d))
#>         V1    V2    V3    V4
#> [1,]  TRUE FALSE  TRUE FALSE
#> [2,] FALSE  TRUE  TRUE FALSE
#> [3,] FALSE FALSE  TRUE FALSE
#> [4,]  TRUE  TRUE FALSE FALSE
#> [5,] FALSE FALSE FALSE  TRUE
# ratios by row
(r <- rowSums(m)/dim(m)[2])
#> [1] 0.50 0.50 0.25 0.50 0.25

# ratios by column
(clm <- colSums(m)/dim(m)[1])
#>  V1  V2  V3  V4 
#> 0.4 0.4 0.6 0.2

Created on 2023-02-27 with reprex v2.0.2
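
One possible reading of the question is that, for each combination of k columns, the 'TRUE ratio' is the number of rows with a TRUE in at least one of the chosen columns, and the combinations should then be sorted by that count. Under that assumption, a minimal base R sketch (not dplyr, as the question asked) could look like the following. The helper name `rank_combos` and the `n_true` and `ratio` output columns are made up for illustration, and the ID column is left out for brevity.

```r
# Data from the question (logical columns only; ID omitted for brevity)
d <- data.frame(
  value_A = c(TRUE, FALSE, FALSE, TRUE, FALSE),
  value_B = c(FALSE, TRUE, FALSE, TRUE, FALSE),
  value_C = c(TRUE, TRUE, TRUE, FALSE, FALSE),
  value_D = c(FALSE, FALSE, FALSE, FALSE, TRUE)
)

# rank_combos() is a hypothetical helper, not part of any package.
# For every combination of k columns it counts the rows in which at
# least one of the chosen columns is TRUE (the row-wise union), then
# sorts the combinations by that count, highest first.
rank_combos <- function(data, k) {
  m <- as.matrix(data)
  combos <- combn(colnames(m), k, simplify = FALSE)
  n_true <- vapply(
    combos,
    function(cols) sum(rowSums(m[, cols, drop = FALSE]) > 0),
    integer(1)
  )
  out <- data.frame(
    columns = vapply(combos, paste, character(1), collapse = " & "),
    n_true = n_true,
    ratio = n_true / nrow(m)
  )
  out <- out[order(-out$n_true), ]
  rownames(out) <- NULL
  out
}

rank_combos(d, 2)  # ordered combinations of 2 columns
rank_combos(d, 3)  # ordered combinations of 3 columns
```

With the sample data, `value_A & value_B` gets `n_true` of 3, matching the example in the question, while the pairs containing `value_C` come out on top with 4. If a tibble is preferred, the result can be passed to `tibble::as_tibble()`.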
