How to update incorrect string based on a correct string

I'm trying to clean a customer dataset that looks like the customers df below. This dataset has many City names spelled incorrectly. I also have another df (Australian_Postcodes ) with the correct spelling of city names in the place_name column. I'm wondering how I update the incorrectly spelled city names in my customers df based on the correctly spelled names in the Australian_Postcodes df.

Many thanks for your help.

customers <- data.frame(
  Name = c("customer1", "customer2", "customer3", "customer4", "customer5", "customer6", "customer7",
           "customer8", "customer9"),
  Address = c("address1", "address2", "address3", "address4", "address5","address6","address7",
              "address8","address9"),
  City = c("Port Norlunga", "Essendn", "Seaford Heigh", "Salisbury","Briar Hill","Elizabeth sa", "WARRADALE SA 5046",
           "Parlawie","M|Vale"),
  State = c("SA", "VIC", "SA", "SA","VIC","SA","SA","SA","SA"),
  Postcode = c(NA,"3040", "5169", "5108",NA, "5000","5043", "5046",NA)
)

Australian_Postcodes <- data.frame(
  place_name = c("Port Noarlunga","Essendon","Seaford Heights", "Salisbury","Briar Hill","Elizabeth", "Warradale",
           "Paralowie", "Morphett Vale"),
  post_code = c("5167","3040", "5169", "5108","3088", "5108","5043","5108","5162"),
  State = c("SA", "VIC", "SA", "SA","VIC","SA","SA","SA","SA")
)

Created on 2021-03-08 by the reprex package (v0.3.0)

This will do that, however, keep in mind that you will need the postcodes to be updated, otherwise you will have gaps.

library(dplyr)

df <- customers %>% 
  left_join(Australian_Postcodes, by = c("Postcode" = "post_code")) %>% 
  select(Name, Address, place_name, State.x, Postcode)

names(df) <- c("Name", "Address", "City", "State", "Postcode")

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.