How to control a decision tree?

Hi everybody,

this is my R script for a decision tree:

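    # Load packages (caret was loaded twice in my original script)
    suppressMessages(library(caret))
    library(rpart.plot)
    library(plyr)
    library(dplyr)
    library(rpart)
    
    data("iris")
    
    # Use lower-case column names, e.g. petal.width, species
    names(iris) = tolower(names(iris))
    
    table(iris$species)
    
    # Any fixed seed makes the split and the CV folds reproducible
    set.seed(123)
    
    # Stratified 70/30 train/test split
    index = createDataPartition(y=iris$species, p=0.7, list=FALSE)
    
    train = iris[index,]
    test = iris[-index,]
    
    # 5-fold cross-validation
    trainctrl <- trainControl(method = "cv", number = 5, verboseIter = FALSE)
    
    # Fit an rpart tree, tuning cp over 10 candidate values and selecting by Kappa
    dt.model <- train(species~., data=train, method = "rpart", 
                      tuneLength = 10,
                      preProcess = c("center", "scale"),
                      trControl = trainctrl,
                      metric="Kappa")
    
    # Evaluate on the held-out test set
    dt.predict <- predict(dt.model, test)
    confusionMatrix(dt.predict, test$species)
    
    # Plot the final tree and show the variable importance
    rpart.plot(dt.model$finalModel)
    varImp(dt.model)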

My feature importances are:

    > varImp(dt.model)
    rpart variable importance
    
                 Overall
    petal.width   100.00
    petal.length   96.95
    sepal.length   45.08
    sepal.width     0.00

Is there a way to make the tree rely less on petal.width? For example, I would like my tree to use petal.length and sepal.length more than petal.width. Is that possible?

Add random noise to your petal.width data?
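
A minimal sketch of what that suggestion could look like, using rpart directly rather than caret. The noise level `noise_sd` and the seed are arbitrary assumptions for illustration, not recommended values. Keep in mind that this deliberately throws away information in petal.width, so it is worth comparing accuracy before and after.

```r
library(rpart)

set.seed(42)  # arbitrary seed so the sketch is reproducible

data("iris")
names(iris) <- tolower(names(iris))

# Hypothetical noise level: one standard deviation of petal.width
noise_sd <- sd(iris$petal.width)

# Copy the data and blur petal.width with Gaussian noise
noisy <- iris
noisy$petal.width <- noisy$petal.width +
  rnorm(nrow(noisy), mean = 0, sd = noise_sd)

# Fit the same tree on the original and on the noisy data
fit_clean <- rpart(species ~ ., data = iris)
fit_noisy <- rpart(species ~ ., data = noisy)

# Compare how much each tree relies on each variable
print(fit_clean$variable.importance)
print(fit_noisy$variable.importance)
```

The larger `noise_sd` is, the less useful petal.width becomes for splitting, so the tree should lean more on the other variables.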

My problem is that in the dataset I am actually using the decision tree on, one variable x is used more than the others. However, x influences the value of another variable y, and y is the one that actually characterizes my classes. To give an example with iris: it is as if petal.length were proportionally related to petal.width, i.e.:

    petal.width  petal.length  class
    1            4             Virginica
    1            6             Setosa
    2            6             Virginica
    2            7             Setosa

My model discriminates between the classes based on petal.width rather than on petal.length. That is why I would like to give different weights to the features.
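
One option that may fit this is the `cost` argument of `rpart()`: it takes one non-negative value per predictor, and the improvement of a candidate split on a variable is divided by that variable's cost when the split is chosen. The sketch below calls rpart directly; the cost values are made up for illustration, and the predictors are written out in the formula so that the order of the cost vector is explicit.

```r
library(rpart)

data("iris")
names(iris) <- tolower(names(iris))

# One cost per predictor, in the same order as in the formula below.
# Illustrative values only: a cost above 1 penalizes splits on that variable.
split_cost <- c(sepal.length = 1, sepal.width = 1,
                petal.length = 1, petal.width = 5)

tree_formula <- species ~ sepal.length + sepal.width +
  petal.length + petal.width

# Same tree with default costs (all 1) and with petal.width penalized
fit_default <- rpart(tree_formula, data = iris)
fit_costed  <- rpart(tree_formula, data = iris,
                     cost = unname(split_cost))

# Compare which variables each tree relies on
print(fit_default$variable.importance)
print(fit_costed$variable.importance)
```

With caret, extra arguments to `train()` are passed on to the underlying fitting function, so `cost` can probably be supplied there as well, but I have not checked how that interacts with the formula interface and pre-processing.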
