Identify duplicates based on 3 variables

Hi, I want to identify duplicates (without necessarily removing them) based on three character variables: district_sanitaire, grappe_no, and hh_num. The current code I have is:

df_household_level <- df_household_level %>%
  group_by(district_sanitaire, grappe_no, hh_num) %>%
  mutate(dupe = n() > 1)

But it doesn't look like this necessarily identifies all double entries for those three variables. Any suggestions?

That should identify groups with two or more rows. Can you show an example with your data where the output is not what you expected?
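One possible cause, offered only as a guess since no data was posted: because the three keys are character variables, values that look identical may differ by stray whitespace or letter case, so they land in different groups. Here is a minimal sketch of how to check for that. The toy values, the `keys` vector, and the `_clean` helper columns are all made up for illustration; only the three column names come from the question.

```r
library(dplyr)

# Hypothetical toy data: column names are from the question, values are invented.
# Row 3 differs from rows 1-2 only by case and a trailing space.
df_household_level <- tibble(
  district_sanitaire = c("A", "A", "a ", "B", "B"),
  grappe_no          = c("1", "1", "1", "2", "3"),
  hh_num             = c("10", "10", "10", "20", "20")
)

keys <- c("district_sanitaire", "grappe_no", "hh_num")

flagged <- df_household_level %>%
  # Assumption: whitespace/case differences are hiding some duplicates,
  # so build normalised copies of the keys and leave the originals untouched.
  mutate(across(all_of(keys), ~ toupper(trimws(.x)), .names = "{.col}_clean")) %>%
  group_by(across(all_of(paste0(keys, "_clean")))) %>%
  mutate(dupe = n() > 1) %>%
  ungroup()

# Look only at the rows flagged as duplicates, without removing anything.
flagged %>% filter(dupe)
```

If the rows flagged this way differ from the ones flagged by the original code, inconsistent formatting in the key columns is the likely culprit. If they are the same, the problem is somewhere else and a small sample of the real data would help.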
