How to identify values meeting X criteria?

Hi.

I'm trying to identify BIN numbers, if there are any, that are associated with more than one unique species. As you can see in my simplified data below, there are plenty of duplicates, but in one case (BIN: BOLD:AAA2226) there are actually two different species attached to this BIN. I'm trying to filter, or select, those samples where one BIN number has more than one unique species, which in this case would only be BOLD:AAA2226 (Pieris rapae and Pieris affinis).

Any suggestions on how to solve this?

df <- data.frame(
  stringsAsFactors = FALSE,
  BIN = c("BOLD:ACS5043", "BOLD:ADJ8787",
          "BOLD:ADJ8787", "BOLD:ADJ8787", "BOLD:ACA1612",
          "BOLD:ACA1612", "BOLD:AAA2226", "BOLD:AAA2226", "BOLD:AAA2226"),
  Species = c("Zemeros flegyas",
              "Dodona adonira", "Dodona adonira", "Dodona adonira",
              "Prioneris thestylis", "Prioneris thestylis", "Pieris rapae",
              "Pieris rapae", "Pieris affinis")
)

How about this? Here df is your dataset, and the column n is the number of distinct species recorded for each BIN, so any BIN with n greater than 1 has more than one species.

library(tidyverse)

df %>%
  distinct(BIN, Species) %>%  # keep one row per BIN/species pair
  add_count(BIN)              # n = number of distinct species per BIN


           BIN             Species n
1 BOLD:ACS5043     Zemeros flegyas 1
2 BOLD:ADJ8787      Dodona adonira 1
3 BOLD:ACA1612 Prioneris thestylis 1
4 BOLD:AAA2226        Pieris rapae 2
5 BOLD:AAA2226      Pieris affinis 2
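
The output above still lists every BIN, so one more step is needed to keep only the BINs with more than one species. A minimal sketch, assuming dplyr is installed and using the example data from the question; the names n_species, multi_species and all_rows are made up for illustration:

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  stringsAsFactors = FALSE,
  BIN = c("BOLD:ACS5043", "BOLD:ADJ8787",
          "BOLD:ADJ8787", "BOLD:ADJ8787", "BOLD:ACA1612",
          "BOLD:ACA1612", "BOLD:AAA2226", "BOLD:AAA2226", "BOLD:AAA2226"),
  Species = c("Zemeros flegyas",
              "Dodona adonira", "Dodona adonira", "Dodona adonira",
              "Prioneris thestylis", "Prioneris thestylis", "Pieris rapae",
              "Pieris rapae", "Pieris affinis")
)

# One row per BIN/species pair, restricted to BINs with more than one species
multi_species <- df %>%
  distinct(BIN, Species) %>%
  add_count(BIN, name = "n_species") %>%  # n_species is an illustrative column name
  filter(n_species > 1)

print(multi_species)

# Alternative: keep all original rows (duplicates included) for those BINs
all_rows <- df %>%
  group_by(BIN) %>%
  filter(n_distinct(Species) > 1) %>%
  ungroup()

print(all_rows)
```

With the example data, multi_species should contain only the two BOLD:AAA2226 rows (Pieris rapae and Pieris affinis), while all_rows keeps all three original samples for that BIN. Which one fits depends on whether you want the distinct BIN/species pairs or the samples themselves.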