How could I populate a variable based on conditions in another dataframe?

Hi,

I'm trying to find a more efficient solution to some code I wrote using case_when from dplyr.

My goal with it was to populate a column I called 'VISIT_DAY' with either "YES" or "NO". The conditions are based on dates matching the first dataframe to the second by ID. I've attached the code for it below (the data used is in the 1000's so I don't think it would be worth pasting a full reprex).

Home5 <- Home3 %>% #Categorising clinic day and non-clinic day readings
  mutate(
    VISIT_DAY = case_when(
    PATIENT_ID == "-2147483646" & DATE %in% dmy(c("06-08-2018", "20-08-2018", "01-10-2018", "04-02-2019","22-07-2019")) ~ "YES",
    PATIENT_ID == "-2147483645" & DATE %in% dmy(c("16-08-2018", "30-08-2018", "11-10-2018", "27-02-2019")) ~ "YES",
    PATIENT_ID == "-2147483644" & DATE %in% dmy(c("06-09-2018", "21-09-2018", "02-11-2018", "01-03-2019", "05-09-2019")) ~ "YES",
    PATIENT_ID == "-2147483643" & DATE %in% dmy(c("21-09-2018", "02-11-2018", "20-03-2019", "06-09-2019")) ~ "YES",
    PATIENT_ID == "-2147483642" & DATE %in% dmy(c("27-09-2018", "12-10-2018", "22-11-2018", "28-03-2019", "26-09-2019")) ~ "YES",
    PATIENT_ID == "-2147483641" & DATE %in% dmy(c("25-10-2018", "08-11-2018", "04-12-2018", "18-12-2018", "18-04-2019")) ~ "YES",
    PATIENT_ID == "-2147483640" & DATE %in% dmy(c("06-11-2018", "22-11-2018", "04-01-2018", "23-05-2019")) ~ "YES",
    PATIENT_ID == "-2147483639" & DATE %in% dmy(c("23-11-2018", "03-01-2019", "07-05-2019")) ~ "YES",
    PATIENT_ID == "-2147483638" & DATE %in% dmy(c("17-01-2019", "31-01-2019", "14-03-2019", "04-07-2019")) ~ "YES",
    PATIENT_ID == "-2147483637" & DATE %in% dmy(c("25-02-2019", "11-03-2019", "15-04-2019", "19-09-2019")) ~ "YES",
    PATIENT_ID == "-2147483635" & DATE %in% dmy(c("12-03-2019", "26-03-2019", "14-05-2019", "24-09-2019")) ~ "YES",
    PATIENT_ID == "-2147483633" & DATE %in% dmy(c("25-04-2019", "08-05-2019", "20-06-2019")) ~ "YES",
    PATIENT_ID == "-2147483632" & DATE %in% dmy(c("20-05-2019", "15-07-2019")) ~ "YES",
    PATIENT_ID == "-2147483631" & DATE %in% dmy(c("29-05-2019", "12-06-2019", "24-07-2019")) ~ "YES",
    PATIENT_ID == "-2147483630" & DATE %in% dmy(c("17-09-2019", "01-10-2019")) ~ "YES",
    TRUE ~ "NO")
    )

I'm going back through my script to tidy it up and make it easily reproducible, so was wondering if anyone would possibly help me in creating a loop to populate it like I have done with case_when.

Many thanks for any help.

Even a dput(your_data[1:10] cut and pasted would help. However, I can tell you now that this path will lead to madness.

1 Like

To populate variable based on conditions that begin in another dataframe, you should use a join... to match the frames together. Hardcoding info from the dataframe is a flawed approach, when you can dynamical join to it.
as per technocrats suggestion, if you dput() a sample of both Home3, and whatever dataframe it was that you got the hardcoded values from, it will be relatively straightforward to produce an effective join.

2 Likes