Hello everyone,
I am new to R, so apologies if this question is naive. I want to match two data frames in two steps. First, I match on age group, gender and highest degree (using left_join from dplyr). Some rows have no corresponding row in the right data frame, so their p_id is NA after the join; for those rows, I want to match again using only age group and gender. My code is below (it does not do what I describe). Why am I getting the warning "the condition has length > 1 and only the first element will be used"? And could you offer some hints on how to achieve what I want? Thanks in advance.
library(dplyr)

matching_function <- function(demobel_df, monitor_data) {
  matching_df <- monitor_data %>%
    select(p_id, AgeGroupMethodology, Gender, AgeExact, HighestDegreeCat, WegingPop)
  demobel_matched <- left_join(demobel_df, matching_df,
                               by = c("ageGroup" = "AgeGroupMethodology", "genderN" = "Gender", "degree" = "HighestDegreeCat"))
  if (is.na(demobel_matched$p_id)) {
    demobel_matched <- left_join(demobel_matched, matching_df,
                                 by = c("ageGroup" = "AgeGroupMethodology", "genderN" = "Gender"))
  } else {
    demobel_matched <- demobel_matched
  }
  demobel_matched$ageDiff <- abs(demobel_matched$age - demobel_matched$AgeExact)
  # Order first by id, then by weight, then by ageDiff
  demobel_matched <- demobel_matched[order(demobel_matched[, "personID"],
                                           demobel_matched[, "WegingPop"],
                                           demobel_matched[, "ageDiff"]), ]
  return(demobel_matched)
}
demobel_matched <- matching_function(demobel_adults, adult_individuals)
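
On the warning: `if` expects a single TRUE or FALSE, but `is.na(demobel_matched$p_id)` returns one value per row, so only the first row decides which branch runs (from R 4.2.0 onwards this is an error rather than a warning). One possible direction, as a minimal sketch rather than a definitive fix, is to split the rows by whether the first join found a match and re-join only the unmatched ones. The toy data below is hypothetical; only the column names are taken from the post, and the sketch assumes each key combination appears at most once in the right data frame.

```r
library(dplyr)

# Hypothetical toy data: column names follow the post, values are made up.
demobel_df <- tibble(
  personID = 1:3,
  ageGroup = c("18-24", "25-34", "25-34"),
  genderN  = c("F", "M", "F"),
  degree   = c("high", "low", "high")
)
matching_df <- tibble(
  p_id                = c(101, 102, 103),
  AgeGroupMethodology = c("18-24", "25-34", "25-34"),
  Gender              = c("F", "M", "F"),
  HighestDegreeCat    = c("high", "high", "low")
)

# Step 1: strict match on age group, gender and degree.
strict <- left_join(
  demobel_df, matching_df,
  by = c("ageGroup" = "AgeGroupMethodology",
         "genderN"  = "Gender",
         "degree"   = "HighestDegreeCat")
)

# Rows that found a partner in step 1.
matched <- filter(strict, !is.na(p_id))

# Rows without a partner: keep only the original left-hand columns
# so the second join does not create duplicated p_id columns.
unmatched <- strict %>%
  filter(is.na(p_id)) %>%
  select(all_of(names(demobel_df)))

# Step 2: relaxed match on age group and gender only.
fallback <- left_join(
  unmatched, select(matching_df, -HighestDegreeCat),
  by = c("ageGroup" = "AgeGroupMethodology", "genderN" = "Gender")
)

# Recombine both parts and restore the original order.
result <- bind_rows(matched, fallback) %>% arrange(personID)
```

In this toy example only person 1 matches in the strict step; persons 2 and 3 get their p_id from the relaxed step. With real data, where several candidates can share the same age group and gender, the second join returns multiple rows per person, so the ordering by weight and ageDiff from the original function would still be needed afterwards.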