Filter and extract features with values passing the threshold

Hi,

I am working with an R data frame and want to extract the features (rows) whose values pass a threshold. For instance, I want to keep only the features with a value >= 50 in every column. The input and expected output are shown below. One way is to use the tidyverse filter() on each column manually, but there are hundreds of columns, so typing each column name would be tedious. Furthermore, each column name starts and ends with a unique string. Is there a way to revise the code below, or a more straightforward method, to do this?

Thank you,
Toufiq

Input

dput(Data)
structure(list(
  Col1_Counts = c(100L, 10L, 2000L, 0L, 2000L, 0L, 11L, 15L, 19L, 0L,
                  100L, 50L, 10L, 100L, 50L),
  CSC1_Counts = c(150L, 50L, 150L, 3L, 50L, 0L, 12L, 16L, 20L, 23L,
                  1000L, 50L, 10L, 50L, 50L),
  BC_Counts = c(50L, 75L, 100L, 10L, 75L, 0L, 13L, 17L, 21L, 0L,
                100000L, 40L, 10L, 100000L, 50L)),
  class = "data.frame",
  row.names = c("Feature_1", "Feature_2", "Feature_3", "Feature_4",
                "Feature_5", "Feature_6", "Feature_7", "Feature_8",
                "Feature_9", "Feature_10", "Feature_11", "Feature_12",
                "Feature_13", "Feature_14", "Feature_15"))
#>            Col1_Counts CSC1_Counts BC_Counts
#> Feature_1          100         150        50
#> Feature_2           10          50        75
#> Feature_3         2000         150       100
#> Feature_4            0           3        10
#> Feature_5         2000          50        75
#> Feature_6            0           0         0
#> Feature_7           11          12        13
#> Feature_8           15          16        17
#> Feature_9           19          20        21
#> Feature_10           0          23         0
#> Feature_11         100        1000    100000
#> Feature_12          50          50        40
#> Feature_13          10          10        10
#> Feature_14         100          50    100000
#> Feature_15          50          50        50

Expected Output:


library(tidyverse)
Data %>% 
  filter(Col1_Counts >=50 & CSC1_Counts >=50 & BC_Counts >=50)

           Col1_Counts CSC1_Counts BC_Counts
Feature_1          100         150        50
Feature_3         2000         150       100
Feature_5         2000          50        75
Feature_11         100        1000    100000
Feature_14         100          50    100000
Feature_15          50          50        50

Created on 2023-02-12 with reprex v2.0.2

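Below is one approach to achieve the desired output. if_all() inside filter() keeps only the rows where the condition holds for every selected column, and everything() selects all columns, so no column names need to be typed.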

Data %>%
  filter(if_all(everything(), ~ . >= 50))
#>            Col1_Counts CSC1_Counts BC_Counts
#> Feature_1          100         150        50
#> Feature_3         2000         150       100
#> Feature_5         2000          50        75
#> Feature_11         100        1000    100000
#> Feature_14         100          50    100000
#> Feature_15          50          50        50

Created on 2023-02-12 with [reprex v2.0.2.9000](https://reprex.tidyverse.org/)
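
For anyone landing here later, two variations on the answer above may be useful. This is only a sketch, not part of the original reply. It assumes the same Data as in the question (rebuilt inline so it runs on its own), and the names threshold, res_numeric, keep and res_base are made up for illustration. The first variant restricts the test to numeric columns, which helps if the real data frame also holds character or factor columns. The second does the same filtering in base R without dplyr.

```r
library(dplyr)

# Same values as the question's dput() output, rebuilt so the snippet is self-contained.
Data <- data.frame(
  Col1_Counts = c(100L, 10L, 2000L, 0L, 2000L, 0L, 11L, 15L, 19L, 0L,
                  100L, 50L, 10L, 100L, 50L),
  CSC1_Counts = c(150L, 50L, 150L, 3L, 50L, 0L, 12L, 16L, 20L, 23L,
                  1000L, 50L, 10L, 50L, 50L),
  BC_Counts   = c(50L, 75L, 100L, 10L, 75L, 0L, 13L, 17L, 21L, 0L,
                  100000L, 40L, 10L, 100000L, 50L),
  row.names   = paste0("Feature_", 1:15)
)

threshold <- 50  # hypothetical name; adjust as needed

# Variant 1 (dplyr): test only the numeric columns, so any non-numeric
# columns in a larger data frame are ignored by the condition.
res_numeric <- Data %>%
  filter(if_all(where(is.numeric), ~ .x >= threshold))

# Variant 2 (base R): Data >= threshold gives a logical matrix; a row is kept
# when the number of TRUE values equals the number of columns.
# Assumes all columns are numeric and contain no NA values.
keep <- rowSums(Data >= threshold) == ncol(Data)
res_base <- Data[keep, , drop = FALSE]

print(res_numeric)
print(res_base)
```

Both variants should return the same six features as the if_all(everything(), ...) call. If some columns should pass when only at least one value meets the threshold, if_any() can be swapped in for if_all().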

@scottyd22 thank you very much. This is indeed helpful.