Filter out observations if certain variables contain the same value

country suit_color shirt_color tie_color
A white blue black
B orange blue pink
C white white white
D orange blue pink
E white white white

In the above example, how can I drop the observations that have all "white" under the color-related variables?

library(dplyr)
df <- tribble(
  ~country, ~suit_color, ~shirt_color, ~tie_color,
  "A", "white", "blue", "black",
  "B", "orange", "blue", "pink",
  "C", "white", "white", "white",
  "D", "orange", "blue", "pink",
  "E", "white", "white", "white"
)

library(dplyr)
library(tidyr)
df <- tribble(
  ~country, ~suit_color, ~shirt_color, ~tie_color,
  "A", "white", "blue", "black",
  "B", "orange", "blue", "pink",
  "C", "white", "white", "white",
  "D", "orange", "blue", "pink",
  "E", "white", "white", "white"
)

special_value <- "white"

(df_summary <- df %>%
  pivot_longer(cols = -country) %>%
  group_by(country) %>%
  summarise(s_cnt = sum(value == special_value),
            n = n()) %>%
  filter(s_cnt != n))

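# Join back to the original data and drop the helper columns
# before assigning, so df_result holds only the original variables
(df_result <- df_summary %>%
  left_join(df, by = "country") %>%
  select(-s_cnt, -n))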
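
The same count-and-compare idea (keep a row when the number of matches differs from the number of colour columns) can probably also be written without reshaping. The following is only a sketch under that assumption; `color_cols` and `df_kept` are names introduced here for illustration, not part of the code above.

```r
library(dplyr)

df <- tribble(
  ~country, ~suit_color, ~shirt_color, ~tie_color,
  "A", "white", "blue", "black",
  "B", "orange", "blue", "pink",
  "C", "white", "white", "white",
  "D", "orange", "blue", "pink",
  "E", "white", "white", "white"
)

special_value <- "white"

# Hypothetical helper: the columns to check, listed explicitly
color_cols <- c("suit_color", "shirt_color", "tie_color")

# Per row, count how many colour columns equal the special value
s_cnt <- rowSums(df[color_cols] == special_value)

# Keep rows where not every colour column matched
df_kept <- df[s_cnt != length(color_cols), ]
df_kept
```

This assumes the colour columns contain no missing values; with `NA`s present the row counts would become `NA` and would need separate handling.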

Thanks! Inspired by your solution, here is my attempt:

df %>% 
  mutate(wsuit = ifelse(suit_color == "white", 1, 0),
         wshirt = ifelse(shirt_color == "white", 1, 0),
         wtie = ifelse(tie_color == "white", 1, 0),
         wtotal = wsuit + wshirt + wtie) %>% 
  filter(wtotal != max(wtotal)) %>%
  select(- starts_with("w"))
#> # A tibble: 3 x 4
#>   country suit_color shirt_color tie_color
#>   <chr>   <chr>      <chr>       <chr>    
#> 1 A       white      blue        black    
#> 2 B       orange     blue        pink     
#> 3 D       orange     blue        pink
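
One caveat with comparing against `max(wtotal)`: it works here because at least one row is all white. If no row were all white, the filter would instead drop the rows with the highest count. Likewise, `select(-starts_with("w"))` would also remove any original column whose name starts with "w". A possible alternative, sketched here on the assumption that dplyr 1.0.4 or later is installed (for `if_all()`) and that every colour variable ends in `_color`, avoids the helper columns entirely; `df_filtered` is just an illustrative name.

```r
library(dplyr)

df <- tribble(
  ~country, ~suit_color, ~shirt_color, ~tie_color,
  "A", "white", "blue", "black",
  "B", "orange", "blue", "pink",
  "C", "white", "white", "white",
  "D", "orange", "blue", "pink",
  "E", "white", "white", "white"
)

special_value <- "white"

# Drop rows where every column ending in "_color" equals the special value
df_filtered <- df %>%
  filter(!if_all(ends_with("_color"), ~ .x == special_value))

df_filtered
```

Because the test is "all colour columns equal the value" rather than "count equals the maximum", the result does not depend on which other rows happen to be in the data.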
