Little help with for loop and pattern matching

Greetings dear community.

I have a data set looks like this

data99 <- data.table(Col1 = c("ZipCodes","1950","1951","NA","1951","1952","10000"),
          Col2= c("Places","Street","City","Neighbourhood","NA","neighbourhood/city","Street"))
colnames(data99) <- as.character(data99[1, ])
data99 <- data99[-1,]

#>    ZipCodes    Places
#>1	   1950	      Street
#>2    1951	      City
#>3	   NA         Neighbourhood
#>4	   1951	      NA
#>5	   1952	      Neighbourhood/city
#>6	   10000	      Street

I didnt have any other dataset so I've created 3 different datasets. First datasets include the all streets and cities they belonged to.

data98 <- data.table(Col1 = c("City","NY","NY","NY","LA","Boston","Boston"),
          Col2= c("Places","WallStreet","Canal","Bowery","RodeoDrive","Charles","Centre"))
colnames(data98) <- as.character(data98[1, ])
data98 <- data98[-1,]

#> City   Streets
#>1	NY	  WallStreet
#>2	NY	  Canal
#>3	NY	  Bowery
#>4	LA	  RodeoDrive
#>5	Boston	Charles
#>6	Boston	Centre

2nd and 3rd datasets has the same logic. 2nd has Neighbourhood - City Match and 3rd one has District - City match.

What I want to do here, I want to use this new datasets to fill gaps with city names in data99. Zipcodes doesnt matter actually, I just need if there is only neighbourhood written in the cell. In the end it should look like this:

data97 <- data.table(Col1 = c("City","NY","NY","NY","LA","Boston","Boston"),
                    Col2= c("Places","WallStreet","Canal","Bowery","RodeoDrive","Charles","Centre"),
                    Col3= c("ZipCodes","1950","1951","NA","1951","1952","10000"))
                    colnames(data97) <- as.character(data97[1, ])
                    data97 <- data97[-1,]
#> City      Places        ZipCodes
#>1	NY	     WallStreet	     1950
#>2	NY	     Canal	         1951
#>3	NY	     Bowery	          NA
#>4	LA	     RodeoDrive	     1951
#>5	Boston	 Charles	     1952
#>6	Boston	 Centre	         10000

I want to add a column near it and bring the city value from other dataset. I guess I need a for loop to check empty cells and look it up from other data. I've also tried to merge datasets together based on Places column but that increased my number of entries from 50k to 70k. Probably some of the streetnames are reapeating in different cities. Some of the places name in data99 is typos so I dont think I can do anything about those. Thank you for your helps. Have a nice day.

One of your colleagues has already asked this:

Aha I wonder who that is :)) Ok I'll try this solution but like I mentioned merge function didnt work for me. It creates extra lines instead of filling the empty lines.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.