Group same strings but different locations

Hi,

I have a column that contains name list that separated by "-".

Name Value
James-Michael-Mike 2
James-Maria-Andrea 3
Michael-Mike-James 5
James-Mike-Michael 7

I want to make:

Name Value
James-Michael-Mike 14
James-Maria-Andrea 3

Note: Order of names does not matter. It can be also "Michael-Mike-James".

Thank you!!

One approach:

  1. Loop over Names column using map_chr to ensure final value is a character string
  2. split string by '-' using strsplit
  3. sort names into alphabetical order
  4. paste names back together again

library(dplyr)
library(purrr)

d %>% 
  mutate(Name = map_chr(strsplit(Name, '-'),
                        ~paste0(sort(.x), collapse = '-'))) %>%
  group_by(Name) %>% 
  summarize(Value = sum(Value))

#> # A tibble: 2 x 2
#>   Name               Value
#>   <chr>              <int>
#> 1 Andrea-James-Maria     3
#> 2 James-Michael-Mike    14
3 Likes

Thank you very much for your solution!!

I have a similar question...

I want to merge two different DF. When I utilize your solution for two different DF, I can't merge them because order of names are different in two DF (first DF James-Michael-Mike, second DF Michael-James-Mike).

How can I merge two different order name list? Thank you.

Hmm, well, you did specifically mention that name order didn't matter...:slight_smile:

You could either, sort the names on both data frames, or else where I overwrote the name column, you could put it in a new column, then merge on the original.

Otherwise, try posting another question and provide appropriate data

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.