I am relatively new to R and am starting to work with a larger dataframe.
I have a basic question about cleaning my data. Several of my columns contain the same value with slightly different spellings (e.g. "red" and "RED"). Sometimes there are 3-4 variations of the same value.
What is the best way to consolidate all the different spellings into a single one, for instance just "red"?
For misspellings, string distance can help; base R has agrep for approximate matching (see ?agrep).
It is only an estimate and not 100% reliable, but you can create something that gives you a confidence level for each suspected misspelling. I'll let you look into it. The stringdist package may also help.
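To make this a bit more concrete, here is a minimal sketch using base R only. The data frame, the column name colour, the list of canonical spellings and the distance threshold of 2 are all made up for illustration, so adjust them to your data. Case differences such as "red" vs "RED" do not need fuzzy matching at all; tolower() and trimws() handle those. The edit distance from adist() is then used as a rough confidence measure for the remaining misspellings.

```r
# Hypothetical example data: one column with inconsistent spellings
df <- data.frame(
  colour = c("red", "RED", " Red", "rde", "blue", "BLUE", "bleu"),
  stringsAsFactors = FALSE
)

# Step 1: normalise case and strip stray whitespace ("RED", " Red" -> "red")
df$colour_clean <- tolower(trimws(df$colour))

# Optional: list values that approximately match "red" to inspect them by eye
agrep("red", df$colour_clean, max.distance = 1, value = TRUE)

# Step 2: map each value to the closest canonical spelling by edit distance
canonical <- c("red", "blue")                 # assumed list of correct values
d <- adist(df$colour_clean, canonical)        # rows = values, cols = canonical
nearest <- canonical[apply(d, 1, which.min)]  # closest canonical spelling
min_d <- apply(d, 1, min)                     # its distance (lower = more confident)

# Only replace when the distance is small; the threshold 2 is an assumption
df$colour_fixed <- ifelse(min_d <= 2, nearest, df$colour_clean)

table(df$colour_fixed)
```

With very short words a threshold of 2 is quite permissive, so it is worth checking the values where min_d is greater than 0 by hand before overwriting anything.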