Is there a package in R that can, at a minimum, make an educated guess at splitting an email address into first name and last name? I have addresses like this: firstname.lastname@example.org. I can strip the @samething.org part from all of them, but I will be left with strings like johnsmith, and I would really rather not separate a few hundred addresses into first name and last name by hand if I can help it. Any thoughts or suggestions?
I think the only way here is to match against a reference database of first names. I've seen such lists available as text files online.
Here's a bit of code for inspiration, @mmahoney:
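    library("stringr")
    # Small stand-in for a reference list of first names
    db = c("dan", "john", "kyle")
    # Try removing each candidate first name from the string
    s = str_replace(string = "johnsmith", pattern = db, replacement = "")
    # The shortest result is the one where a name actually matched
    i = which.min(nchar(s))
    n = c(db[i], s[i])
    > n
    [1] "john"  "smith"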
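To run the same idea over a few hundred addresses, something like the sketch below might be a starting point. It uses base R only, and `split_email_name()` is a hypothetical helper name, not a function from any package. It assumes the reference names are lower case, splits on a separator (`.`, `_` or `-`) when one is present, and otherwise looks for a known first name at the start of the string, preferring the longest match. That last rule is only a heuristic, so ambiguous addresses will still need a manual look.

```r
# Hypothetical helper: split the local part of an email into first/last name.
# `first_names` stands in for a real (lower-case) reference list of first names.
split_email_name <- function(email, first_names) {
  local <- tolower(sub("@.*$", "", email))  # drop the domain part

  # Case 1: an explicit separator does the work for us.
  if (grepl("[._-]", local)) {
    parts <- strsplit(local, "[._-]")[[1]]
    return(c(first = parts[1], last = parts[length(parts)]))
  }

  # Case 2: no separator; keep known first names that the string starts with
  # and that leave at least one character over for the last name.
  hits <- first_names[startsWith(local, first_names)]
  hits <- hits[nchar(hits) < nchar(local)]
  if (length(hits) == 0) {
    return(c(first = NA_character_, last = NA_character_))  # no guess possible
  }

  first <- hits[which.max(nchar(hits))]  # heuristic: prefer the longest match
  c(first = first, last = substring(local, nchar(first) + 1))
}

# Example data, made up for illustration
db <- c("dan", "john", "johnny", "kyle")
emails <- c("johnsmith@example.org", "kyle.jones@example.org", "zedbrown@example.org")

# One row per address, with columns "first" and "last"
result <- t(sapply(emails, split_email_name, first_names = db))
result
```

Addresses with no matching first name come back as `NA`, which makes it easy to filter out the ones that still need splitting by hand.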
Hope it helps