This will run based on what you've provided (there is no ID column in LL so it needs renaming:
LL |>
tibble() |>
rename(ID = customerID) |>
mutate(ID = as.character(ID)) |>
left_join(select(boga, ID, meanLead), by = "ID") |>
left_join(select(ring_fence, ID, rf_trader = trader) |> mutate(ID = as.character(ID)), by = "ID") |>
left_join(select(SAM, ID, sam_trader = trader) |> mutate(ID = as.character(ID)), by = "ID") |>
mutate(
rule = case_when(
!is.na(meanLead) ~ "BOGA",
!is.na(rf_trader) ~ "RF",
!is.na(sam_trader) ~ "SAM",
TRUE ~ "0"
),
lead = case_when(
!is.na(meanLead) ~ meanLead,
!is.na(rf_trader) ~ rf_trader,
!is.na(sam_trader) ~ sam_trader,
TRUE ~ "ML"
)
) |>
select(-meanLead, -rf_trader, -sam_trader)
To unpack:
-
tibble() just makes the dataframe easier to look at as the printing method is better than the default data.frame. You can drop it if you'd like.
-
|> is the base pipe. You can use %>% instead, in this case they're interchangeable.
-
Everything untill the last left_join() is joining on the other tables by the "ID" column. It is important that they're all the same data type, so I set them all to be characters.
-
Inside the mutate() I use case_when(), which is a vectorised if_else(). The first one says "if there is a meanLead, set rule to BOGA. Else, if there is an rf_trader, set ruleto RF. Else, if there is a sam_trader, set the rule to SAM. Otherwise just set it to 0.
-
Then I drop the meanLead, rf_trader and sam_trader columns as they are no longer needed.