Hi guys, I am trying to run a decision tree with binary variables (dependent and all independent variables are binary).
However, I got this in the split
what does gendermen< 0.03937576 mean? after 2)
Was 0.03937576 the probability? Since 0 will be no and 1 will be yes. If 0.5 was set up as the cutoff, then gendermen<0.039 will be put in the "No" category, so it represents "women?"
gendermen looks like a dummy variable created by train.formula() and gendermen< 0.03937576 should capture people who were not men (unless you centered and scaled these data).
The numbers in () are the rates of the outcome for that subgroup which predominantly had the class "no"
It is more interpretable if you don't convert them to dummy variables; the split would look like gender = "men".