I need to get the join shown below, detecting even partial matches, perhaps with some string-distance technique, and I also need a score variable that indicates how good each match is. If the match is perfect, the score should be 100; if it is not a perfect match, the score should be less than 100.
df1 <- data.frame(var1 = 1, Type = "megane business hiter")
df2 <- data.frame(var1 = 1:3, Type = c("megane business hiter",
                                       "megan businss limited",
                                       "meganee busi lim."))
I need a result as below:
df_Result <- data.frame(
  Type.x = c("megane business hiternaer", "megane business hiternaer",
             "megane business hiternaer"),
  Type.y = c("megane business hiter",
             "megan businss limited",
             "meganee busi lim."),
  score_result = c("100 (full identification)", "less than 100, e.g. 75", "less than 100, e.g. 45"))
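One possible way to build a table of this shape is sketched below. It assumes the stringdist package is installed and uses df1 exactly as defined above, so Type.x is "megane business hiter". The choice of the Jaro method ("jw") and the rounding to a 0-100 scale are assumptions made for illustration, not the only way to score matches.

```r
library(stringdist)

df1 <- data.frame(var1 = 1, Type = "megane business hiter")
df2 <- data.frame(var1 = 1:3,
                  Type = c("megane business hiter",
                           "megan businss limited",
                           "meganee busi lim."))

# Cross join: pair every Type in df1 with every Type in df2.
# by = NULL asks merge() for the Cartesian product; the shared
# column name "Type" is disambiguated with the .x / .y suffixes.
pairs <- merge(df1["Type"], df2["Type"], by = NULL, suffixes = c(".x", ".y"))

# stringsim() returns a similarity in [0, 1], where 1 means identical.
# Scaling by 100 gives the requested "100 = perfect match" score.
pairs$score_result <- round(100 * stringsim(pairs$Type.x, pairs$Type.y, method = "jw"))

print(pairs)
```

The first row is an exact match and scores 100; the other two rows score below 100. A threshold on score_result could then be used to keep only the pairs considered close enough.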
The Jaro distance (method='jw', p=0) is a number between 0 (exact match) and 1 (completely dissimilar) measuring dissimilarity between strings. It is defined to be 0 when both strings have length 0, and 1 when there are no character matches between a and b. Otherwise, the Jaro distance is defined as 1 - (1/3)(w_1 m/|a| + w_2 m/|b| + w_3 (m-t)/m). Here, |a| indicates the number of characters in a, m is the number of character matches and t the number of transpositions of matching characters. The w_i are weights associated with the characters in a, characters in b and with transpositions. A character c of a matches a character from b when c occurs in b, and the index of c in a differs less than max(|a|,|b|)/2 - 1 (where we use integer division) from the index of c in b. Two matching characters are transposed when they are matched but they occur in different order in string a and b.
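To make the formula concrete, here is a small worked instance in plain R. The pair "MARTHA" / "MARHTA" is a commonly used illustration and is not taken from the data above. It assumes all weights w_i equal 1 and counts the swapped "TH"/"HT" pair as a single transposition, which is the usual convention.

```r
a <- "MARTHA"
b <- "MARHTA"

m        <- 6  # all six characters of a find a match in b
n_transp <- 1  # "TH" vs "HT": one transposition (assumed convention)

# Jaro distance with w_1 = w_2 = w_3 = 1
jaro_dist <- 1 - (1 / 3) * (m / nchar(a) + m / nchar(b) + (m - n_transp) / m)

# Reverse and rescale so that 100 means a perfect match
score <- round(100 * (1 - jaro_dist))

print(jaro_dist)  # about 0.0556
print(score)      # 94

# Optional cross-check against the package, if it is installed
if (requireNamespace("stringdist", quietly = TRUE)) {
  print(stringdist::stringdist(a, b, method = "jw", p = 0))
}
```

Because the distance is 0 for identical strings, it has to be reversed (1 minus the distance) before scaling if a higher score is meant to indicate a better match.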
I know I reversed the score on purpose.
Thanks for the text from the R help, I saw that.
Could you please create a script similar to the one I created, but using the "lv" method?
Thanks
What have you tried so far? What is your specific problem? We are more inclined towards helping you with specific coding problems than doing your work for you.
Hi Nirgrahamuk,
Thank you very much for that.
Something in my R code got stuck when I tried to use the lv method.
Thanks again for your great support.
Cheers,
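For readers who land here looking for the "lv" variant, a minimal sketch follows. It is not the code from this thread. It assumes the stringdist package is installed, and the normalisation by the longer string's length is one reasonable choice among several for turning an edit count into a 0-100 score.

```r
library(stringdist)

target     <- "megane business hiter"
candidates <- c("megane business hiter",
                "megan businss limited",
                "meganee busi lim.")

# Levenshtein distance: the number of single-character insertions,
# deletions and substitutions needed to turn one string into the other.
edits <- stringdist(target, candidates, method = "lv")

# The distance can never exceed the length of the longer string,
# so dividing by that length maps it onto [0, 1].
longest  <- pmax(nchar(target), nchar(candidates))
score_lv <- round(100 * (1 - edits / longest))

result <- data.frame(Type.x = target, Type.y = candidates,
                     edits = edits, score_result = score_lv)
print(result)
```

Calling stringsim() with method = "lv" should give the same similarity before rounding, since it also normalises by the maximum possible distance.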