ConfusionMatrix for imbalanced data

Hey guys. I successfully sampled my train data and now I want to evaluate whether it helped with accuracy and related metrics. I want to use a confusionMatrix for that. First I want to use it on my train data and afterwards on the sampled data. But before I can even think about that, I need to get it working for the train data. This is the error I face right now.

Error: data and reference should be factors with the same levels.

And this is my code:

# Load and store the dataset
setwd("C:\\Users\\...\\Dropbox\\Uni\\Präsentation\\Datensätze")
data <- read.csv("creditcard.csv")
head(data)
prop.table(table(data$Class))
table(data$Class)
summary(data)
str(data)

# Split the data into train/test sets
library(caret)
index <- createDataPartition(data$Class, p = 0.8, list = FALSE)
train_data <- data[index, ]
test_data  <- data[-index, ]

# Class distribution of the data
table(train_data$Class)
prop.table(table(train_data$Class))
nrow(train_data)
table(test_data$Class)
prop.table(table(test_data$Class))
nrow(test_data)

# Confusion matrix for the test data
library(rpart)
library(caret)
install.packages("e1071")
library(e1071)
fit_train <- rpart(Class ~ ., data = train_data, method = "class")
summary(fit_train)
library(rpart.plot)  # rpart.plot() lives in its own package
rpart.plot(fit_train)
pred_fit_train <- predict(fit_train, newdata = test_data, type = "class")
table(test_data$Class, pred_fit_train)

#confusion.matrix(test_data,pred_fit_train,threshold=0.5)
#confusionMatrix(predict(fit_train, test_data), test_data$Class, positive = '1')

# Accuracy / Specificity / Sensitivity / Precision / Recall on the test data
confusionMatrix(data = as.integer(pred_fit_train) ,
                reference =  test_data$Class,
                positive = "1")

How do I get rid of this error?

What happens if you use confusionMatrix(pred_fit_train, test_data$Class) instead?
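
As the error message says, confusionMatrix() wants both arguments as factors with the same levels. In the code above, as.integer() strips the factor from the predictions, and test_data$Class is most likely a plain integer column after read.csv(). Here is a minimal sketch of one possible fix, using small made-up vectors in place of the real objects. The values and the names test_class, reference and cm_table are only assumptions for illustration, not taken from creditcard.csv.

```r
# Toy stand-ins for the objects in the question (hypothetical values).
pred_fit_train <- factor(c("0", "0", "1", "0", "1", "0"), levels = c("0", "1"))
test_class     <- c(0L, 0L, 1L, 1L, 1L, 0L)  # plays the role of test_data$Class

# as.integer() on a factor returns the internal level codes, not the labels,
# and the result is no longer a factor.
as.integer(pred_fit_train)
# 1 1 2 1 2 1

# Convert the reference to a factor with the same levels as the predictions.
reference <- factor(test_class, levels = levels(pred_fit_train))

# Base R cross-tabulation: reference in rows, prediction in columns.
cm_table <- table(reference = reference, prediction = pred_fit_train)
print(cm_table)

# With both arguments as factors, caret's confusionMatrix() can be called directly.
# The call is guarded so the sketch also runs where caret or e1071 is missing.
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("e1071", quietly = TRUE)) {
  cm <- caret::confusionMatrix(data = pred_fit_train,
                               reference = reference,
                               positive = "1")
  print(cm)
}
```

Applied to the original code, that would mean passing pred_fit_train unchanged as data and factor(test_data$Class, levels = levels(pred_fit_train)) as reference. Alternatively, converting data$Class to a factor right after read.csv() should keep the types consistent everywhere, including in createDataPartition() and rpart().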
