Existing function that will detect equal values across a data row

jtbrown1980 · November 7, 2018, 3:24pm

I'm trying to find a way to remove rows with equal values in all columns for various data frames.
The data frames will never be larger than 500 rows by 10 columns so size is not an issue.

Before and after are below. Rows 2 and 5 were dropped in df_after because, in each of those rows, every column holds the same value.

Thank you for your assistance on this. I'm still very new to R so any guidance is always welcomed.

df_before <- data.frame("x1" = c("a","b","c","d","e"), "x2" = c("c","b","c","x","e"), "x3" = c("a","b","t","s","e"))

df_after <- data.frame("x1" = c("a","c","d"), "x2" = c("c","c","x"), "x3" = c("a","t","s"))

joels · November 7, 2018, 4:01pm

Here's a base R solution:

df_after = df_before[apply(df_before, 1, function(x) length(unique(x)) > 1), ]

I don't do row-wise operations often enough to remember what the current tidyverse idiom is, but another option is to convert to long format for the filtering and then convert back to wide. The code below seems awfully long to me, so maybe there's a less verbose approach.

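library(tidyverse)  # attaches dplyr, tidyr, and tibble

df_after = df_before %>% 
  rowid_to_column() %>%                   # integer row id, so row order survives past 9 rows
  gather(key, value, -rowid) %>%          # long format: one row per cell
  group_by(rowid) %>% 
  filter(length(unique(value)) > 1) %>%   # keep rows with more than one distinct value
  spread(key, value) %>%                  # back to wide format
  ungroup() %>% 
  select(-rowid)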
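
For what it's worth, if dplyr 1.0.0 or later is available (an assumption on my part), a row-wise version might look like the sketch below. It assumes all columns share one type, here character, and it returns a tibble rather than a plain data frame.

```r
library(dplyr)  # assumes dplyr >= 1.0.0 for rowwise() with c_across()

# Example data from the question; stringsAsFactors = FALSE keeps columns as character
df_before <- data.frame(
  x1 = c("a", "b", "c", "d", "e"),
  x2 = c("c", "b", "c", "x", "e"),
  x3 = c("a", "b", "t", "s", "e"),
  stringsAsFactors = FALSE
)

df_rowwise <- df_before %>%
  rowwise() %>%                                        # evaluate the filter one row at a time
  filter(n_distinct(c_across(everything())) > 1) %>%   # keep rows with 2+ distinct values
  ungroup()
```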

A shorter hybrid solution:

df_before %>% 
  filter(apply(., 1, function(x) length(unique(x)) > 1))
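
All of the versions above ask the same question of each row: does it contain more than one distinct value? As a rough sketch, the same test can be done in base R without apply() by comparing every column to the first one. The helper name drop_constant_rows is made up for illustration, and the sketch assumes there are no NA values and that all columns share one type.

```r
# Example data from the question; stringsAsFactors = FALSE keeps columns as character
df_before <- data.frame(
  x1 = c("a", "b", "c", "d", "e"),
  x2 = c("c", "b", "c", "x", "e"),
  x3 = c("a", "b", "t", "s", "e"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop rows whose value is identical in every column
drop_constant_rows <- function(df) {
  m <- as.matrix(df)  # character matrix; factors become their labels
  # m[, 1] is recycled down each column, so every cell is compared
  # with the first cell of its own row
  differs <- rowSums(m != m[, 1]) > 0
  df[differs, , drop = FALSE]
}

df_after <- drop_constant_rows(df_before)
```

With the example data this keeps rows 1, 3, and 4, and the original row names are preserved.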

jtbrown1980 · November 7, 2018, 4:10pm

Thank you! This helps so much.
