Hello!
I am running a random forest model, and my data includes an ID column that is being used as a predictor. How do I keep it in the dataset without using it as a predictor?
Note: this ID shows up as an important variable, even though it should have no relevance to the prediction.
Please help.
Snippet of the random forest call:
library(randomForest)
set.seed(0)  # randomForest() has no random_state argument; seed R's RNG instead
classifier = randomForest(dpt3 ~ ., data = train,
                          importance = TRUE, ntree = 1000, mtry = 4,
                          replace = TRUE)
When I check variable importance, here is what I get. The ID is at the top:
randomForest::importance(classifier)
                  1         2         3 MeanDecreaseAccuracy MeanDecreaseGini
public_id 60.485515 97.234941 79.874132           149.191509       2258.26709
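
One possible way to do this, sketched under assumptions: in an R formula, `.` expands to every column except the response, and `- public_id` removes the ID from that expansion, so the column stays in the data frame but is not passed to the forest. The toy `train` data below is made up for illustration (only `public_id` and `dpt3` come from the post; `age` and `visits` are hypothetical), and the `randomForest` call is skipped if the package is not installed.

```r
# Toy stand-in for the real `train` data. `age` and `visits` are
# hypothetical columns; only public_id and dpt3 come from the post.
set.seed(0)
n <- 200
train <- data.frame(
  public_id = seq_len(n),
  age       = rnorm(n, mean = 30, sd = 8),
  visits    = rpois(n, lambda = 3)
)
train$dpt3 <- factor(ifelse(train$visits + rnorm(n) > 3, "yes", "no"))

# "." = all columns except the response; "- public_id" drops the ID.
rf_formula <- dpt3 ~ . - public_id

# Base R check of which predictors the formula actually selects.
predictors <- attr(terms(rf_formula, data = train), "term.labels")
print(predictors)

# Fit the forest only if the randomForest package is available.
if (requireNamespace("randomForest", quietly = TRUE)) {
  classifier <- randomForest::randomForest(rf_formula, data = train,
                                           importance = TRUE, ntree = 200)
  print(rownames(randomForest::importance(classifier)))
}

# public_id is still a column of `train`, e.g. for joining predictions back.
id_still_present <- "public_id" %in% names(train)
```

An alternative with the same effect would be to fit on a subset such as `train[, setdiff(names(train), "public_id")]` and keep the full `train` around for the IDs; the formula version just avoids making a second copy of the data.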