Hi,
I have a data set with several columns. Below is some example data and the code I'm using:
first <- c("02/02/1987", "03/12/2020", "03/12/2023", "04/07/1999", "03/02/2020", "02/12/2019", "09/09/2019", "12/12/1988")
second <- c("02/02/1989", "03/12/2016", "03/12/2020", "08/12/2020", "02/12/2019", "09/09/2019", "12/12/1988", "10/04/1980")
df <- data.frame(first, second)
paste0(round(100*length(intersect(df$second, df$first))/nrow(df)), "%")
This code gives the right percentage for this example. But when I ran it on my original data, it returned a much lower percentage than expected, e.g. 22% for two almost identical columns.
I'm not sure why this is happening. Maybe it's because I'm dealing with dates in the original data? That's the only difference I can think of. Any ideas?
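One thing that may be worth checking (a guess, not a confirmed diagnosis of the original data): intersect() treats its inputs as sets, so it returns each shared value only once and ignores row position. If the real columns contain repeated dates, the count of unique shared values divided by nrow(df) will come out low even when the columns are identical. The sketch below uses made-up data with duplicates to show the effect, alongside two alternative measures; the variable names pct_intersect, pct_rowwise and pct_in are only illustrative.

```r
# Hypothetical data: two identical columns that contain repeated dates
first  <- c("02/02/1987", "02/02/1987", "03/12/2020", "03/12/2020")
second <- first
df <- data.frame(first, second)

# intersect() keeps each shared value once, so duplicates shrink the count
pct_intersect <- round(100 * length(intersect(df$second, df$first)) / nrow(df))

# Row-by-row comparison: share of rows where both columns hold the same value
pct_rowwise <- round(100 * mean(df$first == df$second))

# Position-independent check: share of 'second' values found anywhere in 'first'
pct_in <- round(100 * mean(df$second %in% df$first))

paste0(pct_intersect, "%")  # "50%" although the columns are identical
paste0(pct_rowwise, "%")    # "100%"
paste0(pct_in, "%")         # "100%"
```

Which measure fits depends on what "almost identical" should mean: the row-wise version compares values at the same position, while the %in% version only asks whether each value appears somewhere in the other column. If the real columns contain NA, the comparisons would need handling for that (for example na.rm = TRUE in mean()).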
I really appreciate any help.
Thank you