Remove second instance of matching column names regardless of order

I searched for help but I'm stumped on this one. I would like to remove the second instance of any row where the names in Name1 and Name2 match, regardless of order, and the values in column ga are the same. I have included two examples below: one tibble with what I have and one tibble with what I want.

library(tidyverse)

# tibble of data current
have <- tibble(
  Name1 = c("Allan", "Allan", "Hanzel", "Prokop",
            "Prokop", "Mynio", "Mynio", "Korchinski"),
  Name2 = c("Hanzel", "Korchinski", "Allan", "Mynio",
            "Korchinski", "Prokop", "Hanzel", "Prokop"),
  ga = c(10, 1, 10, 1, 6, 1, 1, 6)
)

# tibble of data after removing second instance where names from
# Name1 and Name2 match regardless of order and values in ga are equal
want <- tibble(
  Name1 = c("Allan", "Allan", "Prokop", "Prokop", "Mynio"),
  Name2 = c("Hanzel", "Korchinski", "Mynio", "Korchinski", "Hanzel"),
  ga = c(10, 1, 1, 6, 1)
)

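Is this close enough? It puts each pair of names in alphabetical order so that reversed pairs become identical rows, then drops the duplicates with distinct(). Note that the names in some rows end up swapped relative to your original data.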

library(tidyverse)
have <- tibble(
  Name1 = c("Allan", "Allan", "Hanzel", "Prokop",
            "Prokop", "Mynio", "Mynio", "Korchinski"),
  Name2 = c("Hanzel", "Korchinski", "Allan", "Mynio",
            "Korchinski", "Prokop", "Hanzel", "Prokop"),
  ga = c(10, 1, 10, 1, 6, 1, 1, 6)
)


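# Order each pair alphabetically so that ("Hanzel", "Allan") and
# ("Allan", "Hanzel") become the same row, then drop exact duplicates.
wantTest <- have |>
  mutate(Name1New = ifelse(Name1 < Name2, Name1, Name2),
         Name2New = ifelse(Name1 >= Name2, Name1, Name2)) |>
  select(Name1 = Name1New, Name2 = Name2New, ga) |>
  distinct()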
wantTest
#> # A tibble: 5 × 3
#>   Name1      Name2         ga
#>   <chr>      <chr>      <dbl>
#> 1 Allan      Hanzel        10
#> 2 Allan      Korchinski     1
#> 3 Mynio      Prokop         1
#> 4 Korchinski Prokop         6
#> 5 Hanzel     Mynio          1

Created on 2023-05-25 with reprex v2.0.2
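
If the original name order needs to survive (as it does in `want`), one possible variation is to build the sorted pair only as a helper key and de-duplicate on that key, keeping the first matching row untouched. This is a minimal sketch rather than the only way to do it: the column names `key1` and `key2` and the result name `want_ordered` are made up for illustration, and it assumes "second instance" means any later row with the same unordered pair and the same ga.

```r
library(dplyr)
library(tibble)

have <- tibble(
  Name1 = c("Allan", "Allan", "Hanzel", "Prokop",
            "Prokop", "Mynio", "Mynio", "Korchinski"),
  Name2 = c("Hanzel", "Korchinski", "Allan", "Mynio",
            "Korchinski", "Prokop", "Hanzel", "Prokop"),
  ga = c(10, 1, 10, 1, 6, 1, 1, 6)
)

want_ordered <- have |>
  # key1/key2 (hypothetical helper columns) hold each pair in alphabetical
  # order; pmin()/pmax() compare the two character vectors element-wise.
  mutate(key1 = pmin(Name1, Name2),
         key2 = pmax(Name1, Name2)) |>
  # Keep only the first row for each (key1, key2, ga) combination,
  # carrying the original Name1/Name2 along unchanged.
  distinct(key1, key2, ga, .keep_all = TRUE) |>
  # Drop the helper columns again.
  select(Name1, Name2, ga)

want_ordered
```

Because distinct() keeps the first row it sees for each key, the surviving rows keep their original Name1/Name2 order and their original position in the data.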
