How do I improve my model performance?

Hi, I've been trying different ML algorithms to improve the performance of my model. So far the sensitivity is quite low, with the best result stagnating around 20%. What should I do to improve the model's performance?

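library(caTools)  # sample.split()
library(DMwR)     # SMOTE() with the perc.over / perc.under interface

split <- sample.split(ml$Status, SplitRatio = 0.75)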
train <- subset(ml, split == TRUE)
test <- subset(ml, split == FALSE)

#as.data.frame(table(train$Status))
train$Status <- as.factor(train$Status)
train <- SMOTE(Status ~ ., train, perc.over = 350, perc.under = 161)
round(prop.table(table(dplyr::select(train, Status), exclude = NULL)),4)
#...........................................................................
#Model Development
## fit a logistic regression model with the training dataset
log.model <- glm(Status ~., data = train[-c(5)], family = binomial)
summary(log.model)
colnames(test)
test[1:8,]
#.............................................................................
## to predict using logistic regression model, probabilities obtained
log.predictions <- predict(log.model,test, type="response")
logit.pred <- as.factor(ifelse(log.predictions > 0.5, 1, 0))

## Look at the first 10 predicted classes
head(logit.pred, 10)
#..............................................................................
#View the confusion matrix of logistic regression.
#table(logit.pred,test[,8])
caret::confusionMatrix(as.factor(logit.pred), as.factor(test$Status), positive = "1")
#(final_df <- cbind(model_input_df[, c(1,2)], "pred_values" = logit.pred))
#head(final_df)

Do you have reason to believe that a True Positive Rate higher than 20% should be possible given your data and modelling context?

Would you be willing to accept a higher False Negative Rate as a tradeoff?
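
One way to explore this kind of tradeoff is to look at how sensitivity and specificity move as the classification cutoff changes, since the code in the question fixes it at 0.5. The following is only a minimal sketch on synthetic data: `actual` and `score` are made-up stand-ins for `test$Status` and `log.predictions`, and `rates_at` and `sweep` are hypothetical names, so the numbers say nothing about the real model.

```r
# Synthetic stand-ins (assumption: NOT the poster's data)
set.seed(42)
n <- 1000
actual <- rbinom(n, 1, 0.1)                       # roughly 10% positives
score  <- plogis(-2.5 + 1.5 * actual + rnorm(n))  # noisy predicted probabilities

# Sensitivity and specificity at a given cutoff
rates_at <- function(cutoff, prob, truth) {
  pred <- as.integer(prob > cutoff)
  tp <- sum(pred == 1 & truth == 1)
  fn <- sum(pred == 0 & truth == 1)
  tn <- sum(pred == 0 & truth == 0)
  fp <- sum(pred == 1 & truth == 0)
  c(cutoff = cutoff,
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp))
}

# Evaluate a grid of cutoffs and collect the results in a data frame
cutoffs <- seq(0.1, 0.9, by = 0.1)
sweep <- as.data.frame(t(sapply(cutoffs, rates_at, prob = score, truth = actual)))
print(round(sweep, 3))
```

Lowering the cutoff can only keep sensitivity the same or raise it, at the price of more false alarms, so a table like this shows what each extra point of sensitivity would cost in specificity.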

@nirgrahamuk, thank you for the options. I believe a higher true positive rate is possible with the available data, and a much higher one if more features are added.

A false negative rate higher than the true positive rate would definitely be a no.

The purpose of the model is to predict whether a customer has bypassed their meter. The objective is to reduce the cost of physical inspections.
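
Since the goal is stated in terms of inspection cost, one option is to choose the cutoff by cost rather than by accuracy. This is a rough sketch under loud assumptions: the data are synthetic, and `cost_inspection` and `cost_missed` are hypothetical figures invented for illustration, not values from this project.

```r
# Synthetic stand-ins (assumption: NOT the real customer data)
set.seed(1)
n <- 2000
actual <- rbinom(n, 1, 0.08)                    # 1 = meter bypassed
score  <- plogis(-3 + 2 * actual + rnorm(n))    # predicted probability of bypass

# Hypothetical costs, for illustration only
cost_inspection <- 50    # cost of one physical inspection
cost_missed     <- 400   # loss from one undetected bypass

# Total cost if every customer scoring above the cutoff is inspected
total_cost <- function(cutoff) {
  flagged     <- score > cutoff
  inspections <- sum(flagged)
  missed      <- sum(!flagged & actual == 1)
  inspections * cost_inspection + missed * cost_missed
}

# A cutoff of 0 flags everyone, i.e. the "inspect all customers" baseline
cutoffs <- seq(0, 0.95, by = 0.05)
costs   <- sapply(cutoffs, total_cost)
best    <- cutoffs[which.min(costs)]
inspect_all <- n * cost_inspection

print(data.frame(cutoff = cutoffs, cost = costs))
cat("Lowest-cost cutoff on this grid:", best, "\n")
```

With real cost figures plugged in, comparing `min(costs)` against `inspect_all` gives a direct estimate of whether the model saves money at all, which may be a more useful target than sensitivity on its own.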