Type error after rbind


#1

Hi All,

I got this warning after doing rbind. I've seen similar errors while searching online for a solution, but I'm still not sure how to solve mine. Please advise. Thank you.

After joining two data frames:

Warning message:
In `[<-.factor`(`*tmp*`, ri, value = c(1881L, 1888L, NA, NA, NA,  :
  invalid factor level, NA generated
> View(Combined_Schools2018)
> str(Combined_Schools2018)
'data.frame':	14301 obs. of  42 variables:
 $ CDSCODE_  : chr  "_01100170000000" "_01100170109835" "_01100170112607" "_01100170118489" ...
 $ CDSCODE   : chr  "01100170000000" "01100170109835" "01100170112607" "01100170118489" ...
 $ NCESDist  : int  691051 691051 691051 691051 691051 691051 691051 691051 691051 691051 ...
 $ NCESSchool: int  NA 10546 10947 12283 12844 12901 13008 9264 6830 9265 ...
 $ StatusType: Factor w/ 4 levels "Active","Closed",..: 1 2 1 2 1 1 1 1 1 2 ...
 $ County    : Factor w/ 58 levels "Alameda","Alpine",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ District  : Factor w/ 1384 levels "ABC Unified",..: 6 6 6 6 6 6 6 6 6 6 ...
 $ School    : Factor w/ 11288 levels " ","\"Virtual\" Pre",..: 1 3295 3127 625 2075 10980 10108 162 161 163 ...
 $ Street    : Factor w/ 11561 levels " ","1 Carnes Road",..: 6000 6982 2396 4138 4105 557 8519 4949 6000 6000 ...
 $ StreetAbr : Factor w/ 11591 levels " ","1 Carnes Rd.",..: 6013 6999 2401 4145 4112 558 8540 4959 6013 6013 ...
 $ City      : Factor w/ 1102 levels " ","Acampo","Acton",..: 388 643 672 79 672 672 672 856 388 388 ...
 $ Zip       : Factor w/ 10798 levels " ","89447","90001-1133",..: 7172 7317 7638 7673 7585 7604 7657 7413 7172 7172 ...
 $ State     : Factor w/ 2 levels " ","CA": 2 2 2 2 2 2 2 2 2 2 ...
 $ MailStreet: Factor w/ 10265 levels " ","1 Carnes Road",..: 5059 5878 2031 97 3440 449 7186 4162 5059 5059 ...
 $ MailStrAbr: Factor w/ 10292 levels " ","1 Carnes Rd.",..: 5069 5893 2036 97 3447 450 7205 4170 5069 5069 ...
 $ MailCity  : Factor w/ 1061 levels " ","Acampo","Acton",..: 382 623 651 651 651 651 651 821 382 382 ...
 $ MailZip   : Factor w/ 9902 levels " ","65546","81324",..: 6545 6667 6968 6909 6918 6937 6989 6751 6545 6545 ...
 $ MailState : Factor w/ 2 levels " ","CA": 2 2 2 2 2 2 2 2 2 2 ...
 $ Phone     : Factor w/ 10828 levels " ","(209) 223-1750",..: 2755 1 2533 1 2631 2445 2731 2628 2629 1 ...
 $ Ext       : int  NA NA NA NA NA NA NA NA NA NA ...
 $ Website   : Factor w/ 4439 levels " ","acornstooakscharter.org",..: 887 1 1651 958 1390 4258 4031 887 887 887 ...
 $ OpenDate  : Factor w/ 1443 levels " ","1/1/1938",..: 1 980 968 851 864 1075 954 481 481 481 ...
 $ ClosedDate: Factor w/ 628 levels " ","1/1/2002",..: 1 445 1 345 1 1 1 1 1 343 ...
 $ Charter   : Factor w/ 3 levels " ","N","Y": 1 3 3 3 3 3 3 2 2 2 ...
 $ CharterNum: Factor w/ 1700 levels " ","0001","0003",..: 1 698 778 1000 1229 1241 1323 1 1 1 ...
 $ FundingTyp: Factor w/ 4 levels " ","Directly funded",..: 1 2 2 2 2 3 2 1 1 1 ...
 $ DOC       : int  0 0 0 0 0 0 0 0 0 0 ...
 $ DOCType   : Factor w/ 12 levels "Administration Only",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ SOC       : int  NA 65 66 66 60 60 60 14 10 13 ...
 $ SOCType   : Factor w/ 20 levels " ","Adult Education Centers",..: 1 14 9 9 8 8 8 13 5 15 ...
 $ EdOpsCode : Factor w/ 14 levels " ","ALTSOC","COMM",..: 1 13 13 13 13 13 13 7 3 8 ...
 $ EdOpsName : Factor w/ 14 levels " ","Alternative School of Choice",..: 1 13 13 13 13 13 13 8 5 9 ...
 $ EILCode   : Factor w/ 8 levels " ","A","ELEM",..: 1 4 5 5 3 3 3 5 5 5 ...
 $ EILName   : Factor w/ 8 levels " ","Adult","Elementary",..: 1 4 5 5 3 3 3 5 5 5 ...
 $ GSoffered : Factor w/ 101 levels " ","1","1-12",..: 1 67 59 59 74 74 74 67 67 48 ...
 $ GSserved  : Factor w/ 111 levels " ","1","1-10",..: 1 77 72 72 82 80 80 61 61 67 ...
 $ Virtual   : Factor w/ 6 levels " ","F","N","P",..: 1 4 3 3 3 3 3 3 3 1 ...
 $ Final_Long: num  0 -122 -122 -122 -122 ...
 $ Final_Lat : num  0 37.5 37.8 37.9 37.8 ...
 $ GEOID00   : num  NA 6001444600 6001402900 6001423000 6001405900 ...
 $ GEOID10   : num  NA 6001444602 6001402900 6001423000 6001406000 ...
 $ GEOID15   : num  NA 6001444602 6001402900 6001423000 6001406000 ...

#2

The warning message isn't as helpful as it could be (it would be nice if it pointed to the problematic column),
but I have seen it when a column of strings has been converted into a factor and, in the row bind, the matching column in the other data frame has values that are not among the defined levels.

Here's a reprex recreating this cause:

library(dplyr)
df1 <- data.frame(
  c1 = as.factor(LETTERS[1:3])
)
str(df1)
#> 'data.frame':    3 obs. of  1 variable:
#>  $ c1: Factor w/ 3 levels "A","B","C": 1 2 3

df2 <- data.frame(
  c1 = c(2:6)
  )
str(df2)
#> 'data.frame':    5 obs. of  1 variable:
#>  $ c1: int  2 3 4 5 6

rbind(
  df1, df2
)
#> Warning in `[<-.factor`(`*tmp*`, ri, value = 2:6): invalid factor level, NA
#> generated
#>     c1
#> 1    A
#> 2    B
#> 3    C
#> 4 <NA>
#> 5 <NA>
#> 6 <NA>
#> 7 <NA>
#> 8 <NA>

Created on 2018-10-23 by the reprex package (v0.2.1)

For example, discussion on stackoverflow:


One quick and easy fix is to make sure both columns are factors:


library(dplyr)
df1 <- data.frame(
  c1 = as.factor(LETTERS[1:3])
)
str(df1)
#> 'data.frame':    3 obs. of  1 variable:
#>  $ c1: Factor w/ 3 levels "A","B","C": 1 2 3

df2 <- data.frame(
  c1 = as.factor(c(2:6))
)

rbind(
  df1, df2
)
#>   c1
#> 1  A
#> 2  B
#> 3  C
#> 4  2
#> 5  3
#> 6  4
#> 7  5
#> 8  6

Created on 2018-10-23 by the reprex package (v0.2.1)
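
Another option, sketched here rather than taken from your data, is to go the other way and turn factor columns into character before binding, so there are no levels to violate. The helper name `factors_to_character` is made up for this example, and `df1`/`df2` are the same toy data frames as above.

```r
# Toy data frames from the reprex above (not the real school data).
df1 <- data.frame(c1 = as.factor(LETTERS[1:3]))
df2 <- data.frame(c1 = 2:6)

# Hypothetical helper: convert every factor column to character,
# leaving all other columns untouched.
factors_to_character <- function(df) {
  df[] <- lapply(df, function(x) if (is.factor(x)) as.character(x) else x)
  df
}

# With a character column there are no factor levels to check,
# so the integer values are coerced to strings instead of NA.
combined <- rbind(factors_to_character(df1), factors_to_character(df2))
str(combined)
```

The combined column ends up as character; it can be converted back with `as.factor()` afterwards if a factor is really needed.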


It might be helpful to see the str() output for both data frames you're row binding.
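
Since the warning doesn't name the column, one way to narrow it down is to compare the column classes of the two inputs side by side. This is only a rough sketch: `df1` and `df2` are made-up stand-ins for your two data frames, and it assumes both have the same column names (which rbind needs anyway).

```r
# Made-up stand-ins for the two data frames being row-bound.
df1 <- data.frame(id = 1:3, c1 = as.factor(LETTERS[1:3]))
df2 <- data.frame(id = 4:8, c1 = 2:6)

# First class of every column, with df2's columns matched to df1's by name.
classes <- data.frame(
  column = names(df1),
  df1 = vapply(df1, function(x) class(x)[1], character(1)),
  df2 = vapply(df2[names(df1)], function(x) class(x)[1], character(1)),
  stringsAsFactors = FALSE
)

# Columns whose classes differ are the first candidates to check.
mismatched <- classes[classes$df1 != classes$df2, ]
print(mismatched)
```

A column that is a factor in one data frame and integer or numeric in the other is the likely culprit.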