I have a data frame of people, one per row. The task I'm struggling with is to assign two reviewers to each person, satisfying these constraints:
- balance the number of reviews for each reviewer,
- each row in `people` might have a "needed reviewer"; when it does, the pair of reviewers assigned to that person must include the needed reviewer.
    library(tidyverse)

    people <- tribble(
      ~name,      ~needed_reviewer,
      "Person 1", "Reviewer 1",
      "Person 2", "Reviewer 3",
      "Person 3", NA_character_,
      "Person 4", NA_character_,
      "Person 5", NA_character_,
      "Person 6", NA_character_
    )

    reviewers <- c("Reviewer 1", "Reviewer 2", "Reviewer 3")

    reviewer_pairs <- combn(reviewers, 2, simplify = FALSE) %>%
      # obtain sufficient repeats; rep_len recycles to a target length
      rep_len(nrow(people))
    # now a list of reviewer pairs, the same length as people
So now I have two tables and I want to match rows up. My desired output is something like the table below (but `reviewer_1` and `reviewer_2` could alternatively be a single list column, `assigned_reviewers`):
    ~name,      ~reviewer_1,  ~reviewer_2
    "Person 1", "Reviewer 1", "Reviewer 2", # matches needed_reviewer constraint
    "Person 2", "Reviewer 1", "Reviewer 3", # matches needed_reviewer constraint
    "Person 3", "Reviewer 2", "Reviewer 3", # no constraint
    "Person 4", "Reviewer 1", "Reviewer 2", # no constraint
    "Person 5", "Reviewer 1", "Reviewer 3", # no constraint
    "Person 6", "Reviewer 2", "Reviewer 3"  # no constraint
But it's not a join (or at least not one that I can see!). Before I had the "needed reviewer" constraint I was using `cbind` to "tack" the reviewer pair columns onto the table, but that doesn't work once I need to match rows up.
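For reference, the unconstrained `cbind` version looked roughly like the sketch below. The names `pair_cols` and `assigned` are just illustrative, and `people` is cut down to the `name` column here, since this version ignores `needed_reviewer` entirely.

```r
library(tidyverse)

# Simplified stand-in for the people table (no constraint column)
people <- tibble(name = paste("Person", 1:6))
reviewers <- c("Reviewer 1", "Reviewer 2", "Reviewer 3")

reviewer_pairs <- combn(reviewers, 2, simplify = FALSE) %>%
  rep_len(nrow(people))

# Stack the pairs into a two-column character matrix, one row per person
pair_cols <- do.call(rbind, reviewer_pairs)
colnames(pair_cols) <- c("reviewer_1", "reviewer_2")

# "Tack" the pair columns onto the table purely by position
assigned <- cbind(people, pair_cols)
```

Because the pairing is positional, each reviewer ends up with the same number of reviews here, but nothing ties a particular pair to a particular row.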
If I were working in something like Python I might iterate through `reviewer_pairs`, popping each one off and finding a place for it in the `people` table (changing it in place, as some sort of global variable). This seems to need some sort of "find and update shared table" step. I tried something along those lines with `purrr::map` but, at least in the tidyverse, the idea of "updating" a shared table isn't part of the paradigm, if I understand correctly.
I tried having a function that found and returned a matched row, and using `purrr::map` to run that for each reviewer pair, but since each item in the list gets the unaltered `people` table, rows in `people` would be assigned more than once. Maybe I need something that passes the unassigned rows to each iteration. Is that possible with the functional programming approach of `purrr`?
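To make the "pass the table along to each iteration" idea concrete, here is a minimal sketch of what I have in mind, using `purrr::reduce` to thread the partially filled table through each step. `assign_pair` is a hypothetical helper I made up for illustration, and the approach is greedy: it calls `stop()` if a pair finds no compatible row, so I don't claim it is a general solution.

```r
library(tidyverse)

people <- tribble(
  ~name,      ~needed_reviewer,
  "Person 1", "Reviewer 1",
  "Person 2", "Reviewer 3",
  "Person 3", NA_character_,
  "Person 4", NA_character_,
  "Person 5", NA_character_,
  "Person 6", NA_character_
)
reviewers <- c("Reviewer 1", "Reviewer 2", "Reviewer 3")
reviewer_pairs <- combn(reviewers, 2, simplify = FALSE) %>%
  rep_len(nrow(people))

# Hypothetical helper: place one reviewer pair on one still-unassigned row.
# Rows whose needed reviewer is in the pair are tried before unconstrained rows.
assign_pair <- function(tbl, pair) {
  unassigned  <- is.na(tbl$reviewer_1)
  constrained <- unassigned & tbl$needed_reviewer %in% pair
  free        <- unassigned & is.na(tbl$needed_reviewer)
  target <- c(which(constrained), which(free))[1]
  if (is.na(target)) {
    stop("no compatible unassigned row for pair: ", paste(pair, collapse = ", "))
  }
  tbl$reviewer_1[target] <- pair[1]
  tbl$reviewer_2[target] <- pair[2]
  tbl
}

# Start with empty reviewer columns; reduce() hands the updated table
# from one pair to the next instead of mutating a shared global.
start <- people %>%
  mutate(reviewer_1 = NA_character_, reviewer_2 = NA_character_)

assigned <- reduce(reviewer_pairs, assign_pair, .init = start)
```

On this small example every pair finds a row and each reviewer gets four reviews, but with a different pair order or more constraints the greedy choice could get stuck.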
I suspect I've horribly over-complicated this. Does anyone have suggestions for "distributing" rows from one table across another table, according to conditions? Perhaps I need to move to some sort of constraint programming package? I find them all pretty unapproachable, but I'd enjoy learning them if there were some approachable tutorials.