Validate data with the help of email_id

I have a data frame like the one below. I want to check whether the part of the email before the @ is duplicated and, if it is, mutate a new column coded 1 for TRUE and 0 for FALSE. I am able to do this part:

df <- data.frame(ID =c("DEV2962","KTN2252","ANA2719","ITI2624","DEV2698","HRT2921","","KTN2624","ANA2548","ITI2535","DEV2732","HRT2837","ERV2951","KTN2542","ANA2813","ITI2210"),
                 email = c("","","","","","","","","","","","","","","",""),
                 name= c("dev,akash","singh,rahul","abbas,salman","lal,ram","singh,nkunj","garg,prabal","ali,sanu","singh,kunal","tomar,lakhan","thakur,praveen","ali,sarman","khan,zuber","singh,giriraj","sharma,lokesh","pawar,pooja","sharma,nikita"))

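library(dplyr)
library(stringr)
# extract the part of the email before "@" and flag repeated values with 1
df <- df %>% 
  mutate(first = str_extract(email, "[^\\@]+"),
         duplicate = as.numeric(duplicated(first))) %>% select(-first)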
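One thing to watch with this approach: when an email is empty, `str_extract()` returns `NA`, and `duplicated()` then flags every `NA` after the first as a duplicate. A minimal sketch of one way to guard against that is below. The addresses and the names `emails`, `local`, and `flagged` are made up for illustration and are not from the real data.

```r
library(dplyr)
library(stringr)

# Hypothetical addresses, for illustration only; the last two rows have no email
emails <- tibble(email = c("a.dev@x.com", "r.singh@x.com", "a.dev@y.org", "", ""))

flagged <- emails %>%
  mutate(
    # part before "@"; NA when the email is empty
    local = str_extract(email, "^[^@]+"),
    # only count a repeat when there actually is a local part
    duplicate = as.numeric(!is.na(local) & duplicated(local))
  ) %>%
  select(-local)

flagged
```

Here the third row is flagged because `a.dev` was already seen, while the two empty emails are left at 0.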

I also have an older version of the same data frame. I want to check whether each email ID is present in the old data frame and, if it is, validate that all the other variables (city, name, ID) match exactly. If they do not match, I want to mutate a new column that flags the discrepancy as TRUE or FALSE. The result should look something like this:

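| ID | city | email | name | duplicate_name | discrepancy |
|---|---|---|---|---|---|
| DEV2962 | del | | dev,akash | 0 | 0 |
| KTN2252 | mum | | singh,rahul | 0 | 0 |
| ANA2719 | nav | | abbas,salman | 0 | 0 |
| ITI2624 | pun | | lal,ram | 0 | 0 |
| DEV2698 | bang | | singh,nkunj | 1 | 0 |
| HRT2921 | chen | | garg,prabal | 0 | 0 |
| | triv | | ali,sanu | 0 | 0 |
| KTN2624 | vish | | singh,kunal | 0 | 0 |
| ANA2548 | del | | tomar,lakhan | 0 | 0 |
| ITI2535 | mum | | thakur,praveen | 0 | 1 |
| DEV2732 | bang | | ali,sarman | 0 | 0 |
| HRT2837 | vish | | khan,zuber | 0 | 0 |
| ERV2951 | bhop | | singh,giriraj | 0 | 0 |
| KTN2542 | kol | | sharma,lokesh | 0 | 0 |
| ANA2813 | noi | | pawar,pooja | 0 | 0 |
| ITI2210 | gurg | | sharma,nikita | 0 | 0 |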

Here is an example of how to find differences between the common rows (matched on id) of two tables:

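library(tidyverse)

# NOTE: the tibble() calls and id columns were lost from the post;
# id = 1:3 and id = 1:4 are assumed values
(data1 <- tibble(id = 1:3,
         a =LETTERS[23:25],
         b = c(4L,1L,2L)))

(data2 <- tibble(id = 1:4,
           a =LETTERS[23:26],
           b = 4:1))

#what id's are common between data1 and data2
(commonids <- inner_join(select(data1, id), select(data2, id),
           by="id")  %>% pull(id))

#find discrepancies: TRUE where the rows for an id differ between the two tables
# (all_equal() is deprecated since dplyr 1.1.0; all.equal() is the suggested replacement)
discrep <- map_lgl(commonids,
     ~!isTRUE(all_equal(filter(data1,id %in% .),
                filter(data2,id %in% .))))

# attach them to data1
data1$discrep <- FALSE
data1[which(data1$id %in% commonids),"discrep"] <- discrep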
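The same idea could be applied to the columns from the question by joining on email instead of id. This is only a rough sketch: `new_df`, `old_df`, and all the values in them are hypothetical stand-ins, and it assumes each email appears at most once in the old data and that ID, city, and name contain no missing values.

```r
library(dplyr)

# Hypothetical stand-ins for the current and the old data frame
new_df <- tibble(
  email = c("a@x.com", "b@x.com", "c@x.com"),
  ID    = c("DEV1", "KTN2", "ANA3"),
  city  = c("del", "mum", "nav"),
  name  = c("dev,akash", "singh,rahul", "abbas,salman")
)
old_df <- tibble(
  email = c("a@x.com", "b@x.com"),
  ID    = c("DEV1", "KTN2"),
  city  = c("del", "pun"),   # city differs from new_df for b@x.com
  name  = c("dev,akash", "singh,rahul")
)

checked <- new_df %>%
  # bring in the old values next to the new ones, matched on email
  left_join(old_df, by = "email", suffix = c("", "_old")) %>%
  mutate(
    # 1 only when the email exists in old_df and some field differs
    discrepancy = as.numeric(
      !is.na(ID_old) &
        (ID != ID_old | city != city_old | name != name_old)
    )
  ) %>%
  select(-ends_with("_old"))

checked
```

Rows whose email is not in the old data keep a discrepancy of 0, since there is nothing to compare against. If that case should be flagged differently, the `!is.na(ID_old)` condition is the place to change it.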
