Summarising a pairwise table

Hi everyone!

I have a table like this:

ID1 ID2 relatedness
A B 0.5
A C 0.8
B C 0.7

I would like to summarise the relatedness values for each ID (A, B, C).
Naturally, I was thinking of using the summarise function to get the min, max, mean, etc. for each accession (ID).

The issue is that if I group_by ID1, then for A both of its relatedness values would indeed be used in the summary.
But for B, only the value from the last row (B C) would be used, because its other value (with A) is on the first row, where B sits in the ID2 column.
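
To make the problem concrete, here is a minimal sketch of what grouping on ID1 alone does. It assumes dplyr is installed; the names `pairs`, `by_id1`, `n` and `mean_relatedness` are made up for this illustration.

```r
library(dplyr)

# Hypothetical name for the table shown above
pairs <- data.frame(ID1 = c("A", "A", "B"),
                    ID2 = c("B", "C", "C"),
                    relatedness = c(0.5, 0.8, 0.7))

# Grouping on ID1 alone: A contributes two values, B only one,
# and C gets no row at all because it never appears in ID1
by_id1 <- pairs %>%
  group_by(ID1) %>%
  summarise(n = n(), mean_relatedness = mean(relatedness))

by_id1
```

So the row where B is paired with A is missed, and C is missing from the summary entirely.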

How can I group by both ID1 and ID2, so that the summary of relatedness for each accession takes into account the rows where it appears in either column?



This is what I understand from your explanation: stack both ID columns into a single column, then group by that column.


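library(dplyr)  # group_by(), summarise(), %>%
library(tidyr)  # gather()

# Example data reproducing the table from the question
sample_df <- data.frame(ID1 = c("A", "A", "B"),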
                        ID2 = c("B", "C", "C"),
                        relatedness = c(0.5, 0.8, 0.7))

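# Stack ID1 and ID2 into one column (Value) so each ID picks up
# every pair it belongs to, then summarise per ID
sample_df %>% 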
    gather(ID_num, Value, -relatedness) %>% 
    group_by(Value) %>% 
    summarise(sum_relatednes = sum(relatedness))
#> Warning: attributes are not identical across measure variables;
#> they will be dropped
#> # A tibble: 3 x 2
#>   Value sum_relatednes
#>   <chr>          <dbl>
#> 1 A                1.3
#> 2 B                1.2
#> 3 C                1.5

Created on 2020-03-09 by the reprex package (v0.3.0.9001)
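
The same reshaping idea extends to the min, max and mean asked for in the question. Below is a minimal sketch, assuming tidyr 1.0.0 or later so that `pivot_longer()` (which supersedes `gather()`) is available. The names `per_id`, `ID`, `n` and the `*_relatedness` columns are made up for illustration, and `stringsAsFactors = FALSE` is set only to keep the ID columns as plain character vectors.

```r
library(dplyr)
library(tidyr)

sample_df <- data.frame(ID1 = c("A", "A", "B"),
                        ID2 = c("B", "C", "C"),
                        relatedness = c(0.5, 0.8, 0.7),
                        stringsAsFactors = FALSE)

per_id <- sample_df %>%
  # Each pair becomes two rows, one per member, both carrying the pair's relatedness
  pivot_longer(cols = c(ID1, ID2), names_to = "ID_num", values_to = "ID") %>%
  group_by(ID) %>%
  # Several summaries per ID in a single summarise() call
  summarise(n = n(),
            min_relatedness = min(relatedness),
            max_relatedness = max(relatedness),
            mean_relatedness = mean(relatedness))

per_id
```

Because every pair is counted once for each of its two members, each ID ends up with all of its relatedness values no matter which column it was listed in.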

Thanks for your reply!
@andresrcs worked out how to do it!

Thanks very much @andresrcs, that's it!
