Predicting with test data

rstudio
randomforest

#1

I fitted a random forest model:

rf <- randomForest(as.factor(class) ~ B2 + B3 + B4, data=training,
importance=TRUE,
ntree=2000, na.action = na.omit)

I am then trying to predict using the testing dataset:
rf.predT <- predict(rf, testing)

But I am getting this error:

rf.predT <- predict(rf, testing)
Error in eval(predvars, data, env) : object 'B2' not found

But names(training) shows that it does contain 'B2':

names(training)
[1] "B1" "B2" "B3" "B4" "B5" "B6" "class"

Any solutions?


#2

Does testing contain B2?
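
A quick way to check is to compare the predictors in the formula against the columns of testing. This is only a sketch in base R: the toy data frame below is made up for illustration (your real testing will differ), and the predictor names are copied from your formula.

```r
# Hypothetical stand-in for the real testing data; here B2 is deliberately absent.
testing <- data.frame(B1 = c(0.1, 0.2), B3 = c(0.3, 0.4), B4 = c(0.5, 0.6))

# Predictors used in the model formula: as.factor(class) ~ B2 + B3 + B4
predictors <- c("B2", "B3", "B4")

# Columns the model needs that testing does not have
missing_cols <- setdiff(predictors, names(testing))
print(missing_cols)
```

If missing_cols is not empty, predict() has nothing to evaluate those terms against and fails with the "object ... not found" error.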


#3

testing did not contain B2 at first, but it does now, so rf.predT <- predict(rf, testing) ran successfully. But when I try to assess the overall accuracy of the classification, I get the following error:

confusionMatrix(rf.predT , testing$class)$overall[1]
Error: data and reference should be factors with the same levels.

levels(rf.predT)
[1] "10" "11" "12" "13" "14" "15" "16"
levels(testing$class)
NULL

How do I ensure the objects testing and rf.predT have the same levels?


#4

You could apply factor() before splitting into training and testing. In this specific case, I believe testing$class is not a factor (levels() returns NULL for it), so I would try confusionMatrix(rf.predT , factor(testing$class))
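
One caveat: factor(testing$class) builds its levels only from the values present in testing, so if a class happens to be absent there, the levels can still differ from those of the predictions. A slightly more defensive sketch is to reuse the prediction levels explicitly. The vectors below are made-up stand-ins for rf.predT and testing$class, and I use base R only, so no caret call appears here.

```r
# Hypothetical stand-ins for rf.predT and testing$class
pred <- factor(c("10", "11", "12", "11"), levels = as.character(10:16))
reference_raw <- c(10, 11, 13, 11)   # not a factor, so levels() would be NULL

# Convert the reference to a factor with exactly the same levels as the predictions.
# Any value not listed in levels(pred) would become NA.
reference <- factor(reference_raw, levels = levels(pred))

# Both objects now share the same level set
stopifnot(identical(levels(pred), levels(reference)))

# Base R confusion table and overall accuracy
conf_table <- table(Prediction = pred, Reference = reference)
accuracy <- mean(pred == reference)
print(conf_table)
print(accuracy)
```

With the real objects this would be factor(testing$class, levels = levels(rf.predT)), which can then be passed as the second argument to confusionMatrix().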


#5

I used confusionMatrix(rf.predT , factor(testing$class)) and it worked. Thank you so much.