Representative Random Forest Plot

Hi All,

I wanted to generate a representative decision tree plot from a random forest output.

Thus far, I have found a couple of routines: 'reprtree' and one using the 'caret' package.

The reprtree routine is as follows:

library(randomForest)
library(reprtree)

model <- randomForest(Species ~ ., data=iris, importance=TRUE, ntree=500, mtry = 2, do.trace=100)

reprtree:::plot.getTree(model)

However, when I use the reprtree routine on my own data, the tree is very large. Does anyone know how to control the depth and complexity of the tree in reprtree?

If anyone has alternative methods, those would also be appreciated :)

This is not the representative tree; it's precisely the first tree from the set of trees composing the random forest, in full. You can pass depth = some integer to draw that tree only down to a given maximum depth, and k to choose a different tree from the forest. But that would still not be a representative tree.
I think what you intend is:

reprtree:::plot.reprtree( ReprTree(model, iris, metric='d2'))

I think the latter also supports a depth parameter; try it out.
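
To see why drawing a single tree from the forest in full gets unreadable, it can help to look at how big the individual trees are. This is only a minimal sketch on iris using randomForest's own treesize() and getTree() helpers; the seed and the object names (leaves, first_tree) are arbitrary choices for illustration, and your own data will give different sizes.

```r
library(randomForest)

set.seed(42)  # arbitrary seed, only so the sketch is repeatable
model <- randomForest(Species ~ ., data = iris, ntree = 500, mtry = 2)

# Number of terminal nodes (leaves) in each of the 500 trees
leaves <- treesize(model)
summary(leaves)

# Node table for tree k = 1, i.e. the first tree of the forest
first_tree <- getTree(model, k = 1, labelVar = TRUE)
head(first_tree)
nrow(first_tree)  # total number of nodes (internal + terminal) in tree 1
```

Each tree in the forest is grown without pruning, so with more predictors and more rows the node count climbs quickly, which is why limiting the plotted depth matters.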

Thanks @nirgrahamuk. I appreciate the new code.

It certainly prints out a tree-like structure. However, the tree is so large that it blurs across the page and looks uninterpretable.

Do you think this is a model-specific problem (my model has around 12 independent variables), or do you think with more tuning, it could be made to look as tidy and interpretable as a decision tree produced by CART-like methods?

Is this after trying the depth parameter?

Yes, with the depth parameter set to 3.

I specified my model as:

model <- randomForest(Y ~ ., 
data=train, importance=TRUE, ntree=500, mtry = 3, do.trace=100, depth = 3)

By comparison, the depth parameter didn't seem to be available in the ReprTree function.

No, you want the randomForest to be whatever depth it is, but your representative tree to be plotted at a human-readable depth.
I was telling you about reprtree:::plot.reprtree() having a depth parameter, i.e.

reprtree:::plot.reprtree( ReprTree(model, iris, metric='d2'),depth=3)
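
As a side note, randomForest() itself does not document a depth argument, so the depth = 3 passed to it above most likely has no effect on the fit. If you ever did want to constrain the forest's own trees (which changes the model, unlike limiting the plotted depth), the documented controls are maxnodes and nodesize. A minimal sketch on iris, with the cap of 8 leaves and the name small_forest chosen arbitrarily for illustration:

```r
library(randomForest)

set.seed(42)  # arbitrary seed for repeatability

# maxnodes caps the number of terminal nodes per tree;
# nodesize sets the minimum size of terminal nodes.
small_forest <- randomForest(Species ~ ., data = iris,
                             ntree = 500, mtry = 2,
                             maxnodes = 8, nodesize = 5)

# Largest number of leaves found in any tree of the constrained forest
max_leaves <- max(treesize(small_forest))
print(max_leaves)
```

Smaller trees usually trade some accuracy for simplicity, so it is worth comparing the out-of-bag error of the constrained and unconstrained forests before relying on this.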

Thanks @nirgrahamuk. Much much better. I appreciate your help.

Can I ask if you know of the other metrics available for constructing the tree beyond the distance metric used in the example above?

Sadly, the documentation of reprtree makes it clear that only d2 has been implemented.
What metric would you prefer?

No, I didn't have a metric in mind, I must admit. I've just always associated that type of metric with unsupervised clustering and the like.

However, again, I appreciate your help with the code above (and all the other instances you've helped recently).

Thanks for your time :)
