I have two data frames. This is just a sample; the actual database has approximately 1 million records.
data1 <- data.frame(
  'External ID' = c(86364, "ARV_2612", "AGH_2212", "IND_2622", "CHG_2622"),
  sector = c(3, 3, 1, 2, 5),
  col1 = c(1, 1, 0, 0, 0),
  'Enternal code' = c(1, 1, 1, 1, 3),
  col3 = c(1, 1, 0, 0, 0),
  col4 = c(1, 0, 0, 0, 0),
  col5 = c(1, 0, 1, 1, 1)
)
data2 <- data.frame(
  'External ID' = c(53265, "ARV_7362", 76354, "IND_2622", "CHG_9762"),
  sector = c(3, 3, 1, 2, 5),
  col1 = c(1, 1, 0, 0, 0),
  'Enternal code' = c(1, 1, 1, 1, 3),
  col3 = c(1, 1, 0, 0, 0),
  col4 = c(1, 0, 0, 0, 0),
  col5 = c(1, 0, 1, 1, 1)
)
I am new to R, and I am looking for a way to mutate one of my data frames (data2). The code should automatically find the column "External ID" and then add a new column to data2, named duplicate, that shows Y if the External ID in data2 is also present in the External ID column of data1, and N otherwise. Is there a simple solution? The output should look like this:
| External.ID | sector | col1 | Enternal.code | col3 | col4 | col5 | duplicate |
|---|---|---|---|---|---|---|---|
| 53265 | 3 | 1 | 1 | 1 | 1 | 1 | N |
| ARV_7362 | 3 | 1 | 1 | 1 | 0 | 0 | N |
| 76354 | 1 | 0 | 1 | 0 | 0 | 1 | N |
| IND_2622 | 2 | 0 | 1 | 0 | 0 | 1 | Y |
| CHG_9762 | 5 | 0 | 3 | 0 | 0 | 1 | N |
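
To make the requested check concrete, here is a minimal sketch in base R. It rests on a few assumptions that are not part of the question: the ID column is located by a case-insensitive name match against "External.ID" (the name that data.frame() generates from 'External ID'), both data frames use the same column name, and `id_col` is a hypothetical variable name chosen for illustration. Using `%in%` is only one possible approach and has not been timed on 1 million rows.

```r
# Sample data from the question, reduced to the ID column for brevity
data1 <- data.frame(
  'External ID' = c(86364, "ARV_2612", "AGH_2212", "IND_2622", "CHG_2622")
)
data2 <- data.frame(
  'External ID' = c(53265, "ARV_7362", 76354, "IND_2622", "CHG_9762")
)

# Locate the ID column by name (assumed pattern; adjust to the real column name)
id_col <- grep("^external[._ ]?id$", names(data2),
               ignore.case = TRUE, value = TRUE)[1]

# Flag each row of data2 whose ID also appears in data1.
# as.character() guards against the column being stored as a factor.
data2$duplicate <- ifelse(
  as.character(data2[[id_col]]) %in% as.character(data1[[id_col]]),
  "Y", "N"
)

data2
```

With the sample data, only IND_2622 appears in both data frames, so it should be the only row flagged Y.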