Comparing multiple columns between different rows

Hello,

I want to compare data between rows (from excel, if possible) based on multiple-column parameters. I want to essentially loop through each row so that each row is compared to every other row. For example, if values between rows in the Analyte, Location ID, and Comments columns are the same, then the value for the Result column must also be the same. If the value for RESULT is different, I would like to flag the rows by appending a column titled "Conflict" indicating a conflicting result as a "yes". Using this logic, Row#s 1, 2, 3, & 4 would be highlighted in my example data below.

'Row#' 'Analyte' 'Location ID''Result' 'Comments'
1 EC 5219 100 Maximum
2 DO 5219 9
3 EC 5219 200 Maximum
4 DO 5219 7
5 As 2898 0.01 Dissolved
6 As 4739 0.01 Dissolved
7 Ba 2898 0.1 Dissolved
8 As 2898 0.01 Dissolved

I figure this is possible but I am unfamiliar with how to accomplish this in RStudio. This is also my first post so I apologize if it is lacking in any way.

Thanks for your help!

Welcome to the community @randyg! Below is one approach I believe gets to your desired result. It groups by the columns that should be the same and tests for more than one distinct Result value in the group. If more than one exists, Conflict is set to "yes".

library(tidyverse)

df = tribble(
  ~'Row#', ~'Analyte', ~'Location ID', ~'Result', ~'Comments',
  1, 'EC', 5219, 100, 'Maximum',
  2, 'DO', 5219, 9, '',
  3, 'EC', 5219, 200, 'Maximum',
  4, 'DO', 5219, 7, '',
  5, 'As', 2898, 0.01, 'Dissolved',
  6, 'As', 4739, 0.01, 'Dissolved',
  7, 'Ba', 2898, 0.1, 'Dissolved',
  8, 'As', 2898, 0.01, 'Dissolved'
)

outcome = df %>%
  group_by(Analyte, `Location ID`, Comments) %>%
  mutate(Conflict = ifelse(n_distinct(Result) > 1, 'yes', 'no')) %>%
  ungroup()

outcome
#> # A tibble: 8 × 6
#>   `Row#` Analyte `Location ID` Result Comments    Conflict
#>    <dbl> <chr>           <dbl>  <dbl> <chr>       <chr>   
#> 1      1 EC               5219 100    "Maximum"   yes     
#> 2      2 DO               5219   9    ""          yes     
#> 3      3 EC               5219 200    "Maximum"   yes     
#> 4      4 DO               5219   7    ""          yes     
#> 5      5 As               2898   0.01 "Dissolved" no      
#> 6      6 As               4739   0.01 "Dissolved" no      
#> 7      7 Ba               2898   0.1  "Dissolved" no      
#> 8      8 As               2898   0.01 "Dissolved" no

Created on 2022-12-16 with reprex v2.0.2.9000

1 Like

This topic was automatically closed 42 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.