Using library(zipcode) to generate 2 new columns?

I have a zip code column in my existing data frame of 18 variables and 65k observations, and I just downloaded library(zipcode). I can't figure out how to create two new columns with city and state (I would be OK with just "state" as well) and add them to the existing data frame. "Help" says that it is its own data frame. Any ideas on how I can get the information I need? Under "Usage" in "Help" it reads "data(zipcode)"; is that the name of the df that I should use?

UPDATE: I tried to merge and received the error below:

data("zipcode")
APSIVMerged3 <- merge(APSIVMerged2,zipcode)
Error: cannot allocate vector of size 10.7 Gb

Update2: I also dropped the columns latitude, longitude, and city, since I only really need "state", and I still get the same error message. I also tried:

gc()

to clear memory, because a different forum recommended it. It did not help.

{zipcode} has been removed from CRAN. However, at one time its dataset zipcode contained a data frame with the variables zip, city, state, latitude and longitude.

First, make sure the zip code variable in your data frame is stored as character and left-padded with zeros to five characters. Second, make sure that every zip code in your data frame also appears in zipcode, with a check like this

e <- mtcars
f <- mtcars
sum(!(e$mpg %in% f$mpg)) == 0
#> [1] TRUE

Created on 2022-12-28 with reprex v2.0.2
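
Applied to zip codes, those two checks might look like the sketch below. It assumes a hypothetical data frame mydata whose zip column was read in as numbers, and it uses a two-row stand-in called zip_lookup rather than the real zipcode data; both names are made up for illustration.

```r
# Hypothetical data: zip codes read in as numbers, so leading zeros were lost.
mydata <- data.frame(id = 1:3, zip = c(2134, 10001, 501))

# Step 1: store zip as character, left-padded with zeros to five characters.
mydata$zip <- sprintf("%05d", as.integer(mydata$zip))
mydata$zip
#> [1] "02134" "10001" "00501"

# Small stand-in for the zipcode lookup table (illustrative rows only).
zip_lookup <- data.frame(
  zip   = c("02134", "10001"),
  state = c("MA", "NY"),
  stringsAsFactors = FALSE
)

# Step 2: check coverage and list any zip codes the lookup does not contain.
all_found <- all(mydata$zip %in% zip_lookup$zip)
missing_zips <- setdiff(mydata$zip, zip_lookup$zip)
all_found
#> [1] FALSE
missing_zips
#> [1] "00501"
```

Listing the missing values with setdiff() shows which rows would end up without a state after the join.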

Then bring in the city and state columns, which are in zipcode but missing from your data, with merge(), like this

e <- mtcars
f <- mtcars
sum(!(e$mpg %in% f$mpg)) == 0
#> [1] TRUE
# fake a variable present in each
set.seed(42)
e$key <- sample(letters,nrow(e),replace = TRUE)
set.seed(137)
f$key <- sample(letters,nrow(f),replace = TRUE)
# fake a variable existing in one but not the other
f$fake <- sample(LETTERS,nrow(f), replace = TRUE)

merge(e,f)
#>   mpg cyl disp  hp drat   wt qsec vs am gear carb key fake
#> 1  15   8  301 335 3.54 3.57 14.6  0  1    5    8   c    W

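The equivalent of your zip column is the key variable. Because no by argument was given, merge() matched on every column the two data frames share (the original mtcars columns plus key), and only one row agreed on all of them: the one where key is c in both e and f. For that match, the variable fake was brought into e from f.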
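
For the zip code case, a minimal sketch of the same idea follows. The data frames mydata and zip_lookup and the column name ZipCode are hypothetical; zip_lookup is a small stand-in with the zip, city and state columns described above. Naming the key columns with by.x and by.y avoids relying on shared column names, and all.x = TRUE keeps rows whose zip code has no match.

```r
# Stand-in for the zipcode lookup table (illustrative rows only).
zip_lookup <- data.frame(
  zip   = c("02134", "10001", "90210"),
  city  = c("Allston", "New York", "Beverly Hills"),
  state = c("MA", "NY", "CA"),
  stringsAsFactors = FALSE
)

# Hypothetical user data whose zip column has a different name.
mydata <- data.frame(
  id      = 1:4,
  ZipCode = c("90210", "02134", "99999", "10001"),
  stringsAsFactors = FALSE
)

# Join on the zip code only, keeping every row of mydata (a left join)
# and bringing in just the state column.
joined <- merge(
  mydata,
  zip_lookup[, c("zip", "state")],
  by.x = "ZipCode", by.y = "zip",
  all.x = TRUE
)

# merge() sorts by the key; restore the original row order.
joined <- joined[order(joined$id), ]

# id 3 (zip 99999) has no match in the lookup, so its state is NA.
joined$state
```

The result has the same number of rows as mydata, which is a quick sanity check that the join did not duplicate or drop observations.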

I was able to download that library from CRAN. The issue ended up being the error code that I provided in the edit/update in my posting. Thanks for the thoughtful response (see commands/attempts below screenshot)

image

The error means R asked for a single 10.7 Gb block of memory and could not get it, either because that much is not free or because the operating system limits how much memory a process may take. In UNIX-like environments any such limit can be inspected and adjusted. See man ulimit for details. Can't help for Windows.
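
One other possible cause, which this thread does not confirm: when two data frames share no column names, merge() without a by argument returns their Cartesian product, pairing every row of one with every row of the other. With 65k observations and a lookup table of tens of thousands of rows, that would mean billions of rows. The sketch below uses made-up data frames a and b, and assumes the zip column is named something other than zip, to show the difference.

```r
# Hypothetical data: the zip column names differ between the two tables.
a <- data.frame(
  ZipCode = c("02134", "10001", "90210"),
  stringsAsFactors = FALSE
)
b <- data.frame(
  zip   = c("02134", "10001"),
  state = c("MA", "NY"),
  stringsAsFactors = FALSE
)

# No shared column names, so merge() has nothing to match on.
shared <- intersect(names(a), names(b))
length(shared)
#> [1] 0

# Without `by`, the result is every combination of rows: 3 * 2 = 6.
crossed <- merge(a, b)
nrow(crossed)
#> [1] 6

# Naming the key columns gives one row per matching zip code instead.
matched <- merge(a, b, by.x = "ZipCode", by.y = "zip")
nrow(matched)
#> [1] 2
```

Comparing names(APSIVMerged2) with names(zipcode) before merging would show whether this applies here.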

To drop longitude and latitude, determine their index positions (probably the last two in colnames(yourdata)). Say they are 15 and 16:

yourdata <- yourdata[-c(15,16)]
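
Dropping by name is an alternative that does not depend on column positions. This is a sketch on a made-up yourdata with illustrative values; drop_cols is a hypothetical helper variable.

```r
# Hypothetical data frame with the columns to be removed (values are illustrative).
yourdata <- data.frame(
  zip       = c("02134", "10001"),
  state     = c("MA", "NY"),
  latitude  = c(42.36, 40.75),
  longitude = c(-71.13, -73.99),
  stringsAsFactors = FALSE
)

# Keep every column whose name is not in drop_cols.
drop_cols <- c("latitude", "longitude")
yourdata <- yourdata[, !(names(yourdata) %in% drop_cols), drop = FALSE]

names(yourdata)
#> [1] "zip"   "state"
```

Because the columns are selected by name, the line still works if the column order changes or if one of the names is already absent.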
