Best object for very fast lookups?

I have approximately 170,000 unique integers that map to 2,000 unique integers. I want to be able to query any of the 170,000 integers and get its matching value from the set of 2,000.

What's the best container in R for this? I'm fine paying a high up-front cost to create the object as long as lookups are fast.

I'm currently using data.table's setkey function, as described here: https://appsilon.com/fast-data-lookups-in-r-dplyr-vs-data-table/


I use data.table as my default now because of the speed.
I don't know whether it uses a binary search tree under the hood.
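
For reference, here is a rough sketch of what a keyed data.table lookup might look like. As far as I understand it, setkey sorts the table by the key column so that subsetting on the key can use binary search rather than a full scan. The data and the names `lookup`, `key_id`, and `value` are made up for illustration; they are not from the original question.

```r
library(data.table)

set.seed(1)
# Hypothetical data: 170,000 unique integer keys mapped to values in 1..2000
keys   <- sample.int(1e7, 170000)                    # sampled without replacement, so unique
values <- sample.int(2000, 170000, replace = TRUE)

lookup <- data.table(key_id = keys, value = values)
setkey(lookup, key_id)   # sorts by key_id so keyed subsetting can use binary search

# Look up a batch of keys in one call; results come back in query order
query  <- keys[1:5]
result <- lookup[.(query), value]
```

Passing the whole query vector at once is usually much faster than looping over keys one at a time.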

You can also consider using environments as a hash table.
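
A minimal sketch of the environment approach, using base R only. Environment names have to be character strings, so the integer keys are converted with as.character. The data and the names `h`, `keys`, and `values` are hypothetical.

```r
set.seed(1)
# Hypothetical data: unique integer keys mapped to values in 1..2000
keys   <- sample.int(1e7, 170000)
values <- sample.int(2000, 170000, replace = TRUE)

# Build a hashed environment; names must be character, so convert the keys
h <- new.env(hash = TRUE, size = length(keys))
list2env(setNames(as.list(values), as.character(keys)), envir = h)

# Single lookup
one <- get(as.character(keys[1]), envir = h)

# Batch lookup: mget returns a named list, flattened here to a plain vector
many <- unlist(mget(as.character(keys[1:5]), envir = h), use.names = FALSE)
```

Whether this beats a keyed data.table probably depends on whether you look up keys one at a time or in batches, so it's worth benchmarking on your own data.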

Yes, it seems like the data.table solution works well. I didn't experiment with a hash table, though. It turns out I only have 13,000 unique values on the LHS after filtering, so it's manageable.
