I'm trying to learn how to use purrr, and I'm trying not to use a for loop.
I have two data frames: one with IDs, and another that serves as a reference.
For each ID, I would like to get the item with the highest count
in the reference that the ID does not already have. To explain:
df_id = data.frame(id = c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                   item = c('x', 'y', 'x', 'z', 'z', 'y', 'z'))
df_ref = data.frame(item = c('w', 'x', 'y', 'z'),
                    count = c(0, 2, 3, 1))
# The result I would like
df_top = data.frame(id = c('a', 'b', 'c'),
                    item = c('z', 'y', 'x'))
Essentially, id = a owns y and x already. So the items that id = a can get are z and w, in that order. Because z has the higher count, df_top gives z to id = a.
Similarly, id = b owns x and z already. This means that y and w can be given, and since y has the higher count, y is given.
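To make the rule concrete, here is a minimal sketch of it written as the plain for loop I would like to replace. It uses base R only. The names ids, top_item, owned, candidates and df_top_loop are just illustrative, and it assumes every id has at least one item in the reference that it does not own yet (otherwise the assignment inside the loop would fail). I also set stringsAsFactors = FALSE so the columns stay character on older R versions.

```r
# Example data from the question
df_id <- data.frame(id = c("a", "a", "b", "b", "b", "c", "c"),
                    item = c("x", "y", "x", "z", "z", "y", "z"),
                    stringsAsFactors = FALSE)
df_ref <- data.frame(item = c("w", "x", "y", "z"),
                     count = c(0, 2, 3, 1),
                     stringsAsFactors = FALSE)

ids <- unique(df_id$id)
top_item <- character(length(ids))

for (i in seq_along(ids)) {
  # Items this id already owns
  owned <- df_id$item[df_id$id == ids[i]]
  # Reference rows for items the id does not own yet
  candidates <- df_ref[!df_ref$item %in% owned, ]
  # Keep the candidate with the highest count (first one wins on ties)
  top_item[i] <- candidates$item[which.max(candidates$count)]
}

df_top_loop <- data.frame(id = ids, item = top_item,
                          stringsAsFactors = FALSE)
```

On the example data this gives z for a, y for b and x for c, which matches df_top. What I'm after is the purrr way of doing the same thing.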
Please let me know if this makes sense.