Hi,
You cannot compare strings like that using intersect or setdiff, because both functions only match strings that are exactly identical.
"3069 XJ" is not the same string as "3069 XJ Rotterdam", so the intersection will be empty and every value will be reported as different. It's not clear what your goal is here: the SD coefficient is based on similarities between lists, but I don't see which lists you are trying to compare.
You could look at which postcodes share the same city, for example, or count the number of unique postcodes, but for that you'd first need to transform your data. For example:
library(tidyverse)

myData <- data.frame(
  stringsAsFactors = FALSE,
  postcode = c("3069 XJ", "3076 BJ", "3037 EA",
               "3043 KC", "3031 AM", "3039 ZK"),
  postcode_city = c("3069 XJ Rotterdam", "3076 BJ Rotterdam", "3037 EA Rotterdam",
                    "3043 KC Rotterdam", "3031 AM Rotterdam", "3039 ZK Rotterdam")
)
myData
#>   postcode     postcode_city
#> 1  3069 XJ 3069 XJ Rotterdam
#> 2  3076 BJ 3076 BJ Rotterdam
#> 3  3037 EA 3037 EA Rotterdam
#> 4  3043 KC 3043 KC Rotterdam
#> 5  3031 AM 3031 AM Rotterdam
#> 6  3039 ZK 3039 ZK Rotterdam

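# Split the postcode into its numeric part and its letter part
myData %>%
  separate(postcode, c("postcode", "abbr"), sep = " ") %>%
  # Remove the leading "<digits> <letters> " so only the city name remains
  mutate(city = str_remove(postcode_city, "^\\d+\\s\\w+\\s")) %>%
  select(-postcode_city)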
#>   postcode abbr      city
#> 1     3069   XJ Rotterdam
#> 2     3076   BJ Rotterdam
#> 3     3037   EA Rotterdam
#> 4     3043   KC Rotterdam
#> 5     3031   AM Rotterdam
#> 6     3039   ZK Rotterdam

Created on 2022-02-04 by the reprex package (v2.0.1)
Now you can do further analyses based on any of the 3 variables. Please explain a bit more about what you would like to do if needed.
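As a rough sketch of one such follow-up analysis, this counts the number of unique postcodes per city. The cleaned data frame is typed in by hand here (first three rows only) so the snippet runs on its own, and the names cleaned, per_city and n_postcodes are just placeholders I picked.

```r
library(dplyr)

# Hand-typed copy of the transformed data (first three rows only)
cleaned <- data.frame(
  postcode = c("3069", "3076", "3037"),
  abbr     = c("XJ", "BJ", "EA"),
  city     = c("Rotterdam", "Rotterdam", "Rotterdam"),
  stringsAsFactors = FALSE
)

# Count the distinct full postcodes (digits plus letters) within each city
per_city <- cleaned %>%
  group_by(city) %>%
  summarise(n_postcodes = n_distinct(paste(postcode, abbr)), .groups = "drop")

per_city
```

With real data containing several cities, the same summary would give one row per city, which you could then compare between your lists.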
Hope this helps,
PJ