# Understanding the classification tree from the tree package

I am trying to understand how the tree package in R works. The following code is from the textbook An Introduction to Statistical Learning.

library(tree)
library(ISLR2)
attach(Carseats)
set.seed(32603)
High <- factor(ifelse(Sales <= 8, "No", "Yes"))
Carseats <- data.frame(Carseats, High)

##### fit a model on all variables except Sales

tree.carseats <- tree(High ~ . - Sales, Carseats)
summary(tree.carseats)
tree.carseats
plot(tree.carseats)
text(tree.carseats, pretty = 0)

My question is: how does the algorithm decide when to stop splitting? I see that the bottom-most nodes contain 5 observations. Is there a threshold on the number of observations in a node at which the algorithm stops?

The help for the `tree()` function shows the following arguments, which include `control`:

``````
tree(formula, data, weights, subset,
     na.action = na.pass, control = tree.control(nobs, ...),
     method = "recursive.partition",
     split = c("deviance", "gini"),
     model = FALSE, x = FALSE, y = TRUE, wts = TRUE, ...)
``````

The description of `control` is:

``````
control    A list as returned by tree.control
``````

The help for `tree.control` shows:

``````
Usage
tree.control(nobs, mincut = 5, minsize = 10, mindev = 0.01)

Arguments
nobs     The number of observations in the training set.

mincut   The minimum number of observations to include in either child node. This is a
         weighted quantity; the observational weights are used to compute the ‘number’.
         The default is 5.

minsize  The smallest allowed node size: a weighted quantity. The default is 10.

mindev   The within-node deviance must be at least this times that of the root node for the
         node to be split.
``````

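So yes, there is a threshold, and in fact there are three. By default, each child node must contain at least 5 observations (`mincut`), the smallest allowed node size is 10 (`minsize`), and a node is only split if its deviance is at least 0.01 times the root deviance (`mindev`). The 5 observations you see in the bottom-most nodes match the default `mincut = 5`. You can adjust any of these by passing your own `tree.control()` to the `control` argument.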
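
To check this on your data, here is a minimal sketch that refits the tree with a custom `tree.control()` and compares the smallest terminal node against the default fit. The object names (`carseats`, `fit_default`, `fit_strict`, `leaf_sizes`) are my own, and the values `mincut = 20` and `minsize = 40` are arbitrary choices for illustration; I kept `mincut` at no more than half of `minsize`.

``````r
library(tree)
library(ISLR2)

# Build the response without attach(); "carseats" is a hypothetical copy
carseats <- transform(Carseats,
                      High = factor(ifelse(Sales <= 8, "No", "Yes")))

# Fit with the default stopping rules (mincut = 5, minsize = 10, mindev = 0.01)
fit_default <- tree(High ~ . - Sales, data = carseats)

# Fit with stricter, arbitrarily chosen stopping rules
fit_strict <- tree(High ~ . - Sales, data = carseats,
                   control = tree.control(nobs = nrow(carseats),
                                          mincut = 20, minsize = 40))

# Helper (my own): observation counts in the terminal nodes of a fitted tree
leaf_sizes <- function(fit) fit$frame$n[fit$frame$var == "<leaf>"]

min(leaf_sizes(fit_default))  # should be no smaller than the default mincut
min(leaf_sizes(fit_strict))   # should be no smaller than 20
``````

Raising `mincut` and `minsize` should give a smaller tree with larger terminal nodes, while lowering them (together with `mindev`) lets the tree grow deeper.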

