I have a data frame of people, one per row. The task I'm struggling with is to assign two reviewers to each person, satisfying these constraints:
- balance the number of reviews for each reviewer,
- each row in people might have a "needed reviewer"; the pair of reviewers assigned to that person must include the needed reviewer.
library(tidyverse)
people <- tribble(
  ~name,      ~needed_reviewer,
  "Person 1", "Reviewer 1",
  "Person 2", "Reviewer 3",
  "Person 3", NA_character_,
  "Person 4", NA_character_,
  "Person 5", NA_character_,
  "Person 6", NA_character_
)
reviewers <- c("Reviewer 1", "Reviewer 2", "Reviewer 3")
reviewer_pairs <- combn(reviewers, 2, simplify = FALSE) %>%
  # obtain sufficient repeats; rep_len recycles the pairs up to a target length
  rep_len(nrow(people))
# reviewer_pairs is now a list of reviewer pairs, the same length as people.
So now I have two tables and I want to match rows up. My desired output is something like the table below (but reviewer_1 and reviewer_2 could alternatively be a single list column, assigned_reviewers):
~name, ~reviewer_1, ~reviewer_2
"Person 1", "Reviewer 1", "Reviewer 2", # matches needed_reviewer constraint
"Person 2", "Reviewer 1", "Reviewer 3", # matches needed_reviewer constraint
"Person 3", "Reviewer 2", "Reviewer 3", # no constraint
"Person 4", "Reviewer 1", "Reviewer 2", # no constraint
"Person 5", "Reviewer 1", "Reviewer 3", # no constraint
"Person 6", "Reviewer 2", "Reviewer 3" # no constraint
But it's not a join (or at least not one that I can see!). Before I had the "needed reviewer" constraint I was using cbind to "tack" the reviewer pair columns onto the table, but that doesn't work once I need to match rows up.
If I were working in something like Python I might iterate through reviewer_pairs, popping each one off and finding a place for it in the people table (changing it in place as some sort of global variable). That seems to need some sort of "find and update shared table" step. I tried something along those lines with purrr::map but, at least in the tidyverse, the idea of "updating" a shared table isn't part of the paradigm, if I understand correctly.
I tried having a function that found and returned a matching row, and using purrr::map to run that for each reviewer pair, but since each item in the list gets the unaltered people table, rows in people would be assigned more than once. Maybe I need something that passes the still-unassigned rows to each iteration. Is that possible with the functional programming approach of map?
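To make the "pass unassigned rows to each iteration" idea concrete, here is a minimal sketch of what I have in mind, using purrr::reduce to thread a state through each step instead of updating a shared table. The helper assign_one, the state list with its pool and assigned fields, and the decision to handle constrained rows first are all my own assumptions for illustration. It is greedy, so on other inputs it may fail to find a valid assignment even when one exists, and the balance comes only from the pool of pairs being balanced.

```r
library(tidyverse)

people <- tribble(
  ~name,      ~needed_reviewer,
  "Person 1", "Reviewer 1",
  "Person 2", "Reviewer 3",
  "Person 3", NA_character_,
  "Person 4", NA_character_,
  "Person 5", NA_character_,
  "Person 6", NA_character_
)
reviewers <- c("Reviewer 1", "Reviewer 2", "Reviewer 3")

# Pool of reviewer pairs, recycled to one pair per person.
pool <- rep_len(combn(reviewers, 2, simplify = FALSE), nrow(people))

# Assumption: constrained rows go first so they get first pick of the pool.
ordered <- people %>% arrange(is.na(needed_reviewer))

# Hypothetical helper: takes the current state, assigns a pair to row i,
# and returns the updated state (remaining pool plus assignments so far).
assign_one <- function(state, i) {
  needed <- ordered$needed_reviewer[i]
  idx <- if (is.na(needed)) {
    1L                                             # no constraint: next pair
  } else {
    detect_index(state$pool, ~ needed %in% .x)     # first pair with the reviewer
  }
  stopifnot(idx > 0)                               # 0 means no suitable pair is left
  state$assigned[[i]] <- state$pool[[idx]]
  state$pool <- state$pool[-idx]
  state
}

init <- list(pool = pool, assigned = vector("list", nrow(ordered)))
final <- reduce(seq_len(nrow(ordered)), assign_one, .init = init)

result <- ordered %>%
  mutate(
    reviewer_1 = map_chr(final$assigned, 1),
    reviewer_2 = map_chr(final$assigned, 2)
  ) %>%
  arrange(name)
```

The point of reduce here is that each call to assign_one receives the pool left over from the previous call, which is the part I could not express with map.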
Have I horribly over-complicated this? Does anyone have suggestions for "distributing" rows from one table across another table, according to conditions? Perhaps I need to move to some sort of constraint programming package? I find them all pretty unapproachable, but I'd enjoy learning one if there were some approachable tutorials.