R-Code - Check duplication of a text column

Hi everyone,
I am new to R and need to do my exercise as described below:

  1. I have a customer file with columns Code, Name, Address, etc., and ~500 records.
  2. In the Name column, some rows are similar, such as Austern Keller, Austernkeller, Aussterkeller...
  3. I need to find the rows that have similar names.

Thank you,
David Doan

Hi David, welcome!

You can use the tidystringdist package; take a look at this similar thread.
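To illustrate the general idea of string-distance matching, here is a minimal sketch that uses base R's `adist()` (Levenshtein distance) instead of tidystringdist, so it runs without installing anything. The sample data, the `key` normalisation step, and the threshold of 2 are assumptions made for illustration, not values from the original question.

```r
# Hypothetical sample data; the real file has ~500 records.
customers <- data.frame(
  Code = c(8, 44, 18, 19, 77),
  Name = c("Aussterkeller", "Austern Keller", "Austernkeller",
           "Alster Au", "AlsterAu"),
  stringsAsFactors = FALSE
)

# Normalise the names: lower-case and remove whitespace, so that
# "Austern Keller" and "Austernkeller" compare as identical.
key <- gsub("\\s+", "", tolower(customers$Name))

# Pairwise Levenshtein (edit) distance between all normalised names.
d <- adist(key)

# Keep each unordered pair once (upper triangle of the matrix).
idx <- which(upper.tri(d), arr.ind = TRUE)
pairs <- data.frame(
  name_1   = customers$Name[idx[, "row"]],
  name_2   = customers$Name[idx[, "col"]],
  distance = d[idx],
  stringsAsFactors = FALSE
)

# Assumed threshold: treat names at most 2 edits apart as "similar".
similar <- pairs[pairs$distance <= 2, ]
similar[order(similar$distance), ]
```

The threshold is a judgment call: a larger value catches more typos but also produces more false matches, so it is worth inspecting the resulting pairs by eye.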

If you need more specific help please provide a minimal REPRoducible EXample (reprex). A reprex makes it much easier for others to understand your issue and figure out how to help.

If you've never heard of a reprex before, you might want to start by reading this FAQ:

Hi Andresrcs,
Thank you so much. I applied it to my scenario successfully. However, is there any way to group similar rows together with a new column, such as:
| Name           | ID | Group ID |
|----------------|----|----------|
| Alster Au      | 19 | 1        |
| AlsterAu       | 77 | 1        |
| Aussterkeller  | 8  | 2        |
| Austern Keller | 44 | 2        |
| Austernkeller  | 18 | 2        |

Thanks
David Doan
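
One possible way to get such a group column is to cluster the names on their pairwise edit distances. The sketch below uses only base R (`adist()`, `hclust()`, `cutree()`) rather than tidystringdist; the column name `GroupID`, the single-linkage method, and the cut height of 2 are assumptions chosen for this small example.

```r
# The five example rows from the question.
customers <- data.frame(
  Name = c("Alster Au", "AlsterAu", "Aussterkeller",
           "Austern Keller", "Austernkeller"),
  ID   = c(19, 77, 8, 44, 18),
  stringsAsFactors = FALSE
)

# Normalise: lower-case and drop whitespace before comparing.
key <- gsub("\\s+", "", tolower(customers$Name))

# Levenshtein distance matrix, converted to a 'dist' object for hclust().
d <- as.dist(adist(key))

# Single-linkage hierarchical clustering: a name joins a group if it is
# close to at least one member of that group.
tree <- hclust(d, method = "single")

# Assumed cut height: names linked by distances of 2 or less share a group.
customers$GroupID <- cutree(tree, h = 2)

customers
```

With this cut height the two Alster names end up in one group and the three Austern names in another. Single linkage can chain loosely related names together on larger data, so on the full file it may be worth trying `method = "complete"` or a smaller cut height and checking the groups manually.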
