Statistics from a confusion matrix

Hi all,

Is there a command in tidymodels that creates a confusion matrix and calculates all the relevant statistics from it (precision, recall, F1, etc.) at the same time? Something similar to confusionMatrix(data = predictions, reference = truth, mode = "prec_recall") in caret? All I've seen from looking around is that the individual statistics have to be calculated using individual commands, but I might have missed things. Thanks!


Have you looked at yardstick for this? It was created specifically for this purpose as part of tidymodels. See the package site (Tidy Characterizations of Model Performance • yardstick), and in particular the article on the various metric classes you can specify (Metric types • yardstick).

You can use conf_mat() (for categorical data),

or tidy(), in a more general form (hypothesis tests, clustering, and so on).

Finally, you can see a related example here:

Hope this helps.
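A minimal sketch of the conf_mat() route, assuming yardstick is installed. The data frame and its column names (truth, predicted) are made up for illustration. Calling summary() on the resulting object returns a tibble with a set of common metrics, including precision, recall and F measure, in one go:

```r
library(yardstick)

# Hypothetical predictions: both columns are factors with the same levels,
# and the first level ("pos") is treated as the event of interest.
df <- data.frame(
  truth     = factor(c("pos", "pos", "neg", "neg", "pos", "neg"), levels = c("pos", "neg")),
  predicted = factor(c("pos", "neg", "neg", "neg", "pos", "pos"), levels = c("pos", "neg"))
)

# Build the confusion matrix, then summarise it into a tibble of metrics.
cm <- conf_mat(df, truth = truth, estimate = predicted)
stats <- summary(cm)

print(cm)
print(stats)
```

The metric names are in the .metric column and the values in .estimate, so the result can be filtered like any other data frame.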

Hi, hope this helps.

# Enter just these four counts; the code below computes the statistics for a two-by-two table of diagnostic test data
TP <- 20
FP <- 180
FN <- 10
TN <- 210


mConfusionMatrix <- matrix(c(TP, FN, FP, TN), nrow = 2)
mfisher <- fisher.test(mConfusionMatrix)

TOTSAM <- TP + FP + FN + TN # Total sample size
mPrevalence <- (TP+FN)/TOTSAM # Prevalence
mAccuracy <- (TP+TN)/TOTSAM # Accuracy
mPPV <- TP/(TP+FP) #Precision positive predictive value
mNPV <- TN/(FN+TN) #negative predictive value
mFDR <- FP/(TP+FP) # False Discovery Rate
mFOR <- FN/(FN+TN) # False omission rate
mTPR <- TP/(TP+FN) #sensitivity or recall
mFPR <- FP/(FP+TN) #fall-out; or false alarm
mFNR <- FN/(TP+FN) #False Negative rate or Miss rate
mTNR <- TN/(FP+TN) #Specificity
mPosLR <- (TP/(TP+FN))/(FP/(FP+TN)) # Positive likelihood ratio
mNegLR <- (FN/(TP+FN)) /(TN/(FP+TN)) # negative likelihood ratio
mDOR <- ((TP/(TP+FN))/(FP/(FP+TN)))/((FN/(TP+FN)) /(TN/(FP+TN)))
mPrevalenceTH <- (sqrt(mTPR*(1-mTNR)) +mTNR -1) / (mTPR + mTNR-1) #Prevalence Threshold
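mF1Score <- 2 * (mPPV*mTPR) / (mPPV+mTPR) # F1 score
mCHISQ <- (TOTSAM * ((FP*FN) - (TP*TN))^2) /(TP+FP) / (FN+TN) / (TP+FN) / (FP+TN) # Chi-square statistic
mAssoc <- ((FP*FN) - (TP*TN)) / ((FP*FN) + (TP*TN)) # left-hand side was lost in the original post, so the name mAssoc is assumed; the expression equals Yule's Q with the sign reversed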

#find Chi-Square critical value
mChisq_Critical <- qchisq(p=.05, df=1, lower.tail=FALSE)
mPvalChiSq <-pchisq(mCHISQ, df=1, lower.tail=FALSE)

library(epitools) # epitab() comes from the epitools package
mepiOR <- epitab(c(TP,FP,FN,TN),method = 'oddsratio')

#odds ratio and confidence interval
mOR <- mepiOR$tab[10] #Odds Ratio
mORCIL <- mepiOR$tab[12]
mORCIU <- mepiOR$tab[14]
mORPValue <-round(mepiOR$tab[16],4)

#Relative risk and confidence interval
mRR <- (TP / (TP+FP)) / (FN / (FN+TN)) #Relative Risk
mRRSE <- sqrt((1/TP - 1/(TP+FP))+(1/FN - 1/(FN+TN)) ) #Standard error of log(relative risk)
mRRCIL <- exp(log(mRR)- (1.96*mRRSE)) #Lower 95% confidence limit
mRRCIU <- exp(log(mRR)+ (1.96*mRRSE)) #Upper 95% confidence limit
mRRZSTAT <- log(mRR)/mRRSE
mRRPValue <- round(2*pnorm(-abs(mRRZSTAT)),4)

mEER <- TP / (TP+FP) #Experimental Event Rate
mCER <- FN / (FN+TN) #Control Event Rate
mRRR <- (mCER-mEER)/mCER #Relative Risk Reduction
mRRI <- (mEER-mCER)/mCER #Relative Risk Increase
mARR <- mCER-mEER #Absolute Risk Reduction
mARI <- mEER-mCER #Absolute Risk Increase
mNNT <- 1/mARR #Number Needed to Treat
mNNH <- 1/mARI #Number Needed to Harm
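
As a sanity check on the hand-written chi-square formula, the same four counts can be passed to base R's chisq.test() with the continuity correction turned off. This is only a sketch; the variable names below are illustrative and independent of the code above:

```r
# Same four counts as in the post above.
TP <- 20; FP <- 180; FN <- 10; TN <- 210
n <- TP + FP + FN + TN

# 2x2 table: rows = test result (+/-), columns = true status (+/-).
tab <- matrix(c(TP, FN, FP, TN), nrow = 2)

# Pearson chi-square for a 2x2 table, written out by hand.
chisq_by_hand <- n * (TP * TN - FP * FN)^2 /
  ((TP + FP) * (FN + TN) * (TP + FN) * (FP + TN))

# Built-in test; correct = FALSE disables the Yates continuity correction
# so that the statistic is comparable with the formula above.
chisq_builtin <- unname(chisq.test(tab, correct = FALSE)$statistic)

agree <- isTRUE(all.equal(chisq_by_hand, chisq_builtin))
print(agree)
```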

Yeah, this works. I was hoping there was a single command that makes R print all the commonly reported statistics from a confusion matrix. Something like this below. Thanks anyway.

This works for sure, but I'm actually looking for a way to write one line of code and have R print all the statistics by itself, as opposed to having to include all that code myself.

Just create a wrapper function around several of the metrics you require.
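
A rough sketch of what such a wrapper might look like in base R. The function name confusion_stats and the choice of metrics are hypothetical; swap in whichever statistics you actually need:

```r
# Hypothetical helper: takes the four cell counts of a 2x2 confusion matrix
# and returns a data frame with one row per metric.
confusion_stats <- function(TP, FP, FN, TN) {
  total       <- TP + FP + FN + TN
  precision   <- TP / (TP + FP)   # positive predictive value
  recall      <- TP / (TP + FN)   # sensitivity
  specificity <- TN / (FP + TN)

  stats <- c(
    accuracy    = (TP + TN) / total,
    prevalence  = (TP + FN) / total,
    precision   = precision,
    recall      = recall,
    specificity = specificity,
    npv         = TN / (FN + TN),
    f1          = 2 * precision * recall / (precision + recall)
  )

  data.frame(metric = names(stats), value = unname(stats))
}

# One call then gives all the selected statistics at once.
result <- confusion_stats(TP = 20, FP = 180, FN = 10, TN = 210)
print(result)
```

Returning a data frame rather than printing inside the function keeps the output easy to filter or join with other results.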
