Error: Missing data in columns: pop when running random forest regression using the ranger package

I am trying to implement random forest (RF) regression using the ranger package in R, but I am getting this error: Error: Missing data in columns: pop (pop is my independent variable) when running the ranger function.

For reference, with the randomForest package I can pass the na.action = na.omit argument to exclude the NA values, but I can't find an equivalent in ranger.

I have tried to do something like this (among other things):

m <- ranger(ntl ~ ., 
                      data = as.data.frame(na.omit(s)), 
                      mtry = 1, 
                      importance = "impurity")

but without success.

How can I exclude the NA values when I run the ranger function?

Here is the code:

library(terra)
library(ranger)

wd = "path/"

ntl = rast(paste0(wd, "ntl2.tif"))

rlist = list.files(path = wd,
                   pattern = "^pop\\d+\\.tif$",
                   all.files = TRUE,
                   full.names = FALSE)

for (i in rlist){
  for (j in i) {
    
    nameNum = gsub("\\D+","",j)
    print(nameNum)
    print(j)
    
    pop = rast(paste0(wd, j))
    
    s = c(ntl, pop)
    names(s) = c("ntl", "pop")
    
    m <- ranger(ntl ~ ., 
                      data = as.data.frame(s), 
                      mtry = 1, 
                      importance = "impurity")

     p <- predict(s, m)
     rsds <- s$ntl - p

     writeRaster(rsds,
            filename = paste("path/rf_resids", 
                             nameNum, 
                             ".tif", 
                             sep=""), overwrite = TRUE)
  }
}

The reason I want to use the ranger package over randomForest is that it is faster to execute.

Two sample rasters:

ntl = rast(ncols=109, nrows=80, nlyrs=1, xmin=-31400, xmax=12200, ymin=6012900, ymax=6044900, names=c('ntl'), crs='PROJCRS[\"World_Mollweide\",BASEGEOGCRS[\"WGS 84\",DATUM[\"World Geodetic System 1984\",ELLIPSOID[\"WGS 84\",6378137,298.257223563,LENGTHUNIT[\"metre\",1]],ID[\"EPSG\",6326]],PRIMEM[\"Greenwich\",0,ANGLEUNIT[\"Degree\",0.0174532925199433]]],CONVERSION[\"unnamed\",METHOD[\"Mollweide\"],PARAMETER[\"Longitude of natural origin\",0,ANGLEUNIT[\"Degree\",0.0174532925199433],ID[\"EPSG\",8802]],PARAMETER[\"False easting\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8806]],PARAMETER[\"False northing\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8807]]],CS[Cartesian,2],AXIS[\"(E)\",east,ORDER[1],LENGTHUNIT[\"metre\",1,ID[\"EPSG\",9001]]],AXIS[\"(N)\",north,ORDER[2],LENGTHUNIT[\"metre\",1,ID[\"EPSG\",9001]]]]')

pop010 = rast(ncols=109, nrows=80, nlyrs=1, xmin=-31400, xmax=12200, ymin=6012900, ymax=6044900, names=c('focal_sum'), crs='PROJCRS[\"World_Mollweide\",BASEGEOGCRS[\"WGS 84\",DATUM[\"World Geodetic System 1984\",ELLIPSOID[\"WGS 84\",6378137,298.257223563,LENGTHUNIT[\"metre\",1]],ID[\"EPSG\",6326]],PRIMEM[\"Greenwich\",0,ANGLEUNIT[\"Degree\",0.0174532925199433]]],CONVERSION[\"unnamed\",METHOD[\"Mollweide\"],PARAMETER[\"Longitude of natural origin\",0,ANGLEUNIT[\"Degree\",0.0174532925199433],ID[\"EPSG\",8802]],PARAMETER[\"False easting\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8806]],PARAMETER[\"False northing\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8807]]],CS[Cartesian,2],AXIS[\"(E)\",east,ORDER[1],LENGTHUNIT[\"metre\",1,ID[\"EPSG\",9001]]],AXIS[\"(N)\",north,ORDER[2],LENGTHUNIT[\"metre\",1,ID[\"EPSG\",9001]]]]')

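The solution is to pass na.rm=TRUE to as.data.frame(), so that cells with an NA in any layer are dropped before fitting, and to predict(), so that those cells are skipped when predicting: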

m <- ranger(ntl~., data=as.data.frame(s, na.rm=TRUE), mtry=1)
p <- predict(s, m, na.rm=TRUE)
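
For anyone who wants to try this without the original files, here is a minimal sketch on two small synthetic rasters. The raster size, the random values, and the 25 NA cells are made up for illustration. The explicit fun wrapper in predict() is an extra precaution I added (it pulls the numeric predictions out of the object that ranger returns); it is not part of the answer above and may be unnecessary with your terra version.

```r
library(terra)
library(ranger)

set.seed(1)

# Synthetic stand-in for ntl (values are made up)
ntl <- rast(ncols = 20, nrows = 20, xmin = 0, xmax = 20, ymin = 0, ymax = 20)
values(ntl) <- runif(ncell(ntl))

# Synthetic stand-in for pop, with NA in the first 25 cells
v <- runif(ncell(ntl))
v[1:25] <- NA
pop <- setValues(ntl, v)

s <- c(ntl, pop)
names(s) <- c("ntl", "pop")

# na.rm = TRUE drops every cell that has an NA in at least one layer
d <- as.data.frame(s, na.rm = TRUE)
nrow(d)  # 375 of the 400 cells remain

m <- ranger(ntl ~ ., data = d, mtry = 1, importance = "impurity")

# na.rm = TRUE skips the NA cells; they stay NA in the output raster
p <- predict(s, m,
             fun = function(model, ...) predict(model, ...)$predictions,
             na.rm = TRUE)

# Residuals, as in the question
rsds <- s$ntl - p
```

The cells that were NA in pop come back as NA in both p and rsds, so the output raster keeps the same extent and resolution as the inputs.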