I am using the distinctive() function to remove duplicate rows in my dataframe. However, when I use a similar function in Excel, it removes a different number of duplicates than RStudio does. Would anyone know why this is happening, or which program is more accurate to use in this scenario?
Welcome to the community!
If I want to remove duplicate rows from a dataframe in R, I use unique().
I'm not aware of a function called distinctive in base R, and a quick Google search turned up nothing. Could you please mention which package this function comes from?
It would also be very helpful if you could share a small part of the data set (say df) and the different results you obtain (say df_R) in a copy-paste friendly format.
In case you don't know how to do it, there are many options, which include:
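One such option, as a minimal sketch: base R's dput() prints a data frame as R code that others can paste straight into their own session. The data frame df below is made up purely for illustration and stands in for a small slice of the real data.

```r
# Hypothetical toy data standing in for a small slice of the real data set
df <- data.frame(
  id   = c(1, 2, 2, 3),
  name = c("a", "b", "b", "c"),
  stringsAsFactors = FALSE
)

# Print the first rows as R code that can be pasted into a forum post
dput(head(df, 3))

# Round trip: capture the printed code as text and rebuild the object from it
df_text <- paste(capture.output(dput(df)), collapse = "\n")
df_copy <- eval(parse(text = df_text))
```

If the real data is confidential, the same approach works on a few rows with the values replaced by dummy ones, as long as the duplicate pattern is preserved.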
Thank you for your reply!
distinctive comes from the tidyverse package. I also just used base::unique to remove duplicate rows, and it gave the same result as distinctive.
Unfortunately, I cannot share a small part of the data set. If it helps, I am working with a very large data set of around 127000 rows and 44 columns. When I remove duplicates in Excel, I am left with 101275 rows, whereas in RStudio I am left with 101636 rows.
I cannot work out which one is correct. I understand if it is too difficult to help without the data set.
Not quite: the function is called distinct(), not "distinctive", and it comes from the dplyr package, which is part of the tidyverse.
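For reference, here is a small sketch of distinct() next to base R's unique() on a made-up data frame (the data and the names df, df_distinct and df_unique are assumptions for illustration only). Both should keep the same rows, which matches the identical counts reported above.

```r
library(dplyr)

# Hypothetical toy data with one exact duplicate row (2, "b")
df <- data.frame(
  id   = c(1, 2, 2, 3, 3),
  name = c("a", "b", "b", "c", "C"),
  stringsAsFactors = FALSE
)

df_distinct <- distinct(df)  # dplyr: keep one copy of each unique row
df_unique   <- unique(df)    # base R: same idea

# R compares strings exactly, so (3, "c") and (3, "C") are different rows
nrow(df_distinct)
nrow(df_unique)
```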
As for your issue, if I were you, I would import the unique values from Excel and perform an anti-join with the result of distinct() in R. That way you can look at exactly which rows differ.
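A minimal sketch of that comparison, assuming two hypothetical data frames: df_R for the output of distinct() and df_excel for the de-duplicated sheet imported from Excel. The toy values and column names are made up; with real data you would join on all columns that define a row.

```r
library(dplyr)

# Hypothetical stand-ins: df_R would be distinct(your_data), and df_excel
# the de-duplicated sheet exported from Excel and read into R
df_R <- data.frame(
  id   = c(1, 2, 3, 3),
  name = c("a", "b", "c", "C"),
  stringsAsFactors = FALSE
)
df_excel <- data.frame(
  id   = c(1, 2, 3),
  name = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Rows that R kept but that are missing from the Excel result
only_in_R <- anti_join(df_R, df_excel, by = c("id", "name"))

# Rows that Excel kept but that are missing from the R result
only_in_excel <- anti_join(df_excel, df_R, by = c("id", "name"))

only_in_R
```

Running the anti-join in both directions shows which side has the extra rows, and inspecting those rows usually makes the cause of the mismatch visible.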
Sorry, I am new to R, so I am still trying to understand all the terms and everything!
Thanks so much for your help! Using anti-join worked perfectly and I worked out where I went wrong.
If your question's been answered (even by you!), would you mind choosing a solution? It helps other people see which questions still need help, or find solutions if they have similar problems.