I am using the randomForest function for regression. Is there a way to get the data sample used for a single tree in the randomForest package? I am aware that I can get the structure of a single tree using the getTree function. The returned object contains predicted values for the outcome, and I am trying to find the actual outcome values so that I can calculate the MSE for that particular tree. I know that the randomForest object contains this information, but it would be neat to do it manually as well.
As per GreyMerchant's request, I've included a reprex below. I use the airquality data set included with R.
    library(randomForest)

    # Make this example reproducible
    set.seed(1)

    # Fit the random forest model
    # (airquality contains NAs, so incomplete rows are dropped)
    model <- randomForest(
      formula = Ozone ~ .,
      data = airquality,
      na.action = na.omit
    )

    # Get first tree
    tree_1 <- getTree(model, k = 1)
Now I would like to get the data that was actually used to fit tree number 1. For a normal regression I would simply use the data from the data argument, but from my understanding, a Random Forest model fits each tree on a sub-sample of the original data.
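To illustrate what I mean by a sub-sample, here is a minimal sketch in base R that mimics the bagging step for one hypothetical tree. It assumes sampling with replacement at the full sample size, which I believe is the default for randomForest; the names aq, in_bag_idx and oob_idx are my own and are not part of the package. It does not recover the sample that randomForest actually drew for tree 1, which is exactly what I am asking about.

```r
# Illustration only: mimic a bootstrap draw with base R.
# This does NOT reproduce the rows randomForest used for tree 1.
set.seed(1)

# Drop incomplete rows (airquality has NAs in Ozone and Solar.R)
aq <- na.omit(airquality)
n <- nrow(aq)

# Draw n row indices with replacement (assumed bagging scheme)
in_bag_idx <- sample(n, size = n, replace = TRUE)

# Rows that were never drawn are "out-of-bag" for this hypothetical tree
oob_idx <- setdiff(seq_len(n), in_bag_idx)

in_bag_data <- aq[in_bag_idx, ]
oob_data <- aq[oob_idx, ]

# With per-tree predictions `pred` for in_bag_data, the MSE would be:
# mean((in_bag_data$Ozone - pred)^2)
```

If I had the equivalent of in_bag_idx for the real tree 1, I could compute that tree's MSE by hand.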
I want the original data so that I can manually calculate the output from