Hi von_olaf,
The 10 rows you requested also contain duplicates. Could you explain how you plan to use the duplicates in your actual application?
I have put together the following code, which produces two outputs: a superset of all three data sets, and a version restricted to rows whose key appears in masterdf.
Superset
library(dplyr)  # provides tibble(), full_join(), distinct() and %>%

masterdf <- tibble(key = c('a', 'a', 'b'))
db1 <- tibble(key = c('a', 'a', 'c'),
              value = c(1, 2, 3))
db2 <- tibble(key = c('a', 'a', 'c'),
              value = c(10, 20, 30))

# Join all three tables on key, keeping unmatched rows from every table.
# distinct() then drops rows that are identical across all columns.
DF <- full_join(x = masterdf, y = db1, by = "key") %>%
  full_join(y = db2, by = "key") %>%
  distinct()
print(DF)
Output of the above code
# A tibble: 6 x 3
  key   value.x value.y
  <chr>   <dbl>   <dbl>
1 a           1      10
2 a           1      20
3 a           2      10
4 a           2      20
5 b          NA      NA
6 c           3      30
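To show where the duplicates come from, here is a small sketch that counts the rows per key in the raw join, before any de-duplication. The names `joined` and `row_counts` are just illustrative. Recent dplyr versions may also print a many-to-many warning here, which is expected because 'a' is repeated in all three tables.

```r
library(dplyr)

masterdf <- tibble(key = c("a", "a", "b"))
db1 <- tibble(key = c("a", "a", "c"), value = c(1, 2, 3))
db2 <- tibble(key = c("a", "a", "c"), value = c(10, 20, 30))

# Raw join result, with no de-duplication applied
joined <- masterdf %>%
  full_join(db1, by = "key") %>%
  full_join(db2, by = "key")

# Number of joined rows per key
row_counts <- count(joined, key)
print(row_counts)
```

Key 'a' appears twice in each of the three tables, so it contributes 2 x 2 x 2 = 8 rows, while 'b' and 'c' contribute one row each, giving 10 rows in total. Only 4 of the 8 'a' rows are distinct.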
Only rows whose key appears in masterdf
library(dplyr)  # provides tibble(), full_join(), distinct() and %>%

masterdf <- tibble(key = c('a', 'a', 'b'))
db1 <- tibble(key = c('a', 'a', 'c'),
              value = c(1, 2, 3))
db2 <- tibble(key = c('a', 'a', 'c'),
              value = c(10, 20, 30))

# Same join as before, then keep only rows whose key exists in masterdf.
# Referencing masterdf$key avoids hard-coding the key values.
DF <- full_join(x = masterdf, y = db1, by = "key") %>%
  full_join(y = db2, by = "key") %>%
  distinct() %>%
  filter(key %in% masterdf$key)
print(DF)
Output of the above code
# A tibble: 5 x 3
  key   value.x value.y
  <chr>   <dbl>   <dbl>
1 a           1      10
2 a           1      20
3 a           2      10
4 a           2      20
5 b          NA      NA
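If you only ever need the keys from masterdf, one possible alternative is to start from its distinct keys and use left joins, so rows for other keys (such as 'c') never enter the result. This is only a sketch under the assumption that duplicates in masterdf carry no extra meaning for you; `DF_left` is an illustrative name.

```r
library(dplyr)

masterdf <- tibble(key = c("a", "a", "b"))
db1 <- tibble(key = c("a", "a", "c"), value = c(1, 2, 3))
db2 <- tibble(key = c("a", "a", "c"), value = c(10, 20, 30))

# Start from the unique master keys, then attach values from db1 and db2.
# left_join() keeps every master key and drops keys found only in db1/db2.
DF_left <- masterdf %>%
  distinct(key) %>%
  left_join(db1, by = "key") %>%
  left_join(db2, by = "key")
print(DF_left)
```

With the sample data this should give the same 5 rows as the filtered version above. Whether it suits your real data depends on how you want the duplicate keys in masterdf to be treated.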
Warm Regards,
Pritish