Sidenote: those ID numbers are reeeeaaaaaally long (19 digits, I count). So long, in fact, that R's `integer` type can't hold them (it's 32-bit, so it tops out at 2,147,483,647), and the `double` class (which is *definitely* not appropriate for storing ID numbers) can only represent them to about 15 to 17 significant figures:

> The 53-bit significand precision gives from 15 to 17 significant decimal digits precision.
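To make that concrete, here's a quick sketch in base R. The two ID values are made up for illustration (they are not from your data); they differ only in the last digit, which is beyond what a double can distinguish at this magnitude.

```r
# Two hypothetical 19-digit IDs that differ only in the final digit
id_a <- "1234567890123456789"
id_b <- "1234567890123456788"

# As strings they are distinct
id_a == id_b                          # FALSE

# Converted to doubles they collapse to the same value
as.numeric(id_a) == as.numeric(id_b)  # TRUE

# And R's integer type can't hold them at all
.Machine$integer.max                  # 2147483647
suppressWarnings(as.integer(id_a))    # NA
```

Doubles near 1.2e18 are spaced 256 apart, so any two IDs that fall in the same 256-wide bucket end up indistinguishable once they're stored as numbers.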

If `read_csv` is reading these ID numbers in as doubles, you're likely losing some of those figures in the import process, which could explain why you're getting so few duplicates. I would've thought you'd get an error or warning if that was happening, but it's hard to tell without seeing what the data frame, read into R, looks like.

If you can't make these ID numbers smaller and you need to use them to find duplicates, maybe have `read_csv` read them in as strings using the `col_types` argument.
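Something along these lines should work, assuming the ID column is called `id` (that name, the `value` column, and the sample rows are placeholders I made up, since I can't see your file):

```r
library(readr)

# Write a small hypothetical CSV to a temp file so the example is self-contained
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "id,value",
  "1234567890123456789,a",
  "1234567890123456788,b",
  "1234567890123456789,c"
), tmp)

# Force the id column to character; any other columns are still guessed
df <- read_csv(tmp, col_types = cols(id = col_character()))

# Duplicates are now checked against the exact digit strings
duplicated(df$id)   # FALSE FALSE TRUE
```

With the IDs kept as strings, the first and third rows are flagged as duplicates of each other and the second row stays distinct, even though it differs only in the last digit.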

**EDIT:** Excel also uses double precision, so if you read your data into Excel and let Excel assume this column is numeric, it'll also probably ruin the numbers.