A couple of excerpts from the caret handbook might help:

Model-independent metrics
For classification, ROC curve analysis is conducted on each predictor. For two class problems, a series of cutoffs is applied to the predictor data to predict the class. The sensitivity and specificity are computed for each cutoff and the ROC curve is computed. The trapezoidal rule is used to compute the area under the ROC curve. This area is used as the measure of variable importance. For multi-class outcomes, the problem is decomposed into all pair-wise problems and the area under the curve is calculated for each class pair (i.e. class 1 vs. class 2, class 2 vs. class 3 etc.). For a specific class, the maximum area under the curve across the relevant pair-wise AUCs is used as the variable importance measure.
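The cutoff-sweep plus trapezoidal-rule idea above can be sketched in a few lines. This is a minimal, library-free illustration (toy data, illustrative function name), not caret's actual implementation: the predictor itself is used as the score, each observation's rank gives one cutoff, and the ROC points are integrated with trapezoids.

```python
# Sketch of the filter approach described above: treat a single predictor
# as a classification score, sweep cutoffs to trace (FPR, TPR) points,
# and integrate the curve with the trapezoidal rule.

def roc_auc(scores, labels):
    """Trapezoidal-rule AUC of `scores` for predicting the positive `labels`."""
    pos = sum(labels)
    neg = len(labels) - pos
    # Sort by descending score; each prefix of this order is one cutoff.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tps = fps = 0
    points = [(0.0, 0.0)]  # (FPR, TPR) pairs along the ROC curve
    for i in order:
        if labels[i]:
            tps += 1
        else:
            fps += 1
        points.append((fps / neg, tps / pos))
    # Trapezoidal rule over consecutive ROC points.
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2
    return auc

# A predictor that separates the classes perfectly has AUC 1.0:
print(roc_auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))  # 1.0
```

An uninformative predictor would hover around 0.5; caret uses this per-predictor AUC as the variable importance score.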

From Measures for Class Probabilities

For data with two classes, there are specialized functions for measuring model performance. First, the `twoClassSummary` function computes the area under the ROC curve and the specificity and sensitivity under the 50% cutoff. Note that:

- this function uses the first class level to define the "event" of interest. To change this, use the `lev` option to the function
- there must be columns in the data for each of the class probabilities (named the same as the outcome's class levels)

```
twoClassSummary(test_set, lev = levels(test_set$obs))
```

```
## ROC Sens Spec
## 0.9560044 0.9336735 0.8246269
```

A similar function can be used to get the analogous precision-recall values and the area under the precision-recall curve:

```
prSummary(test_set, lev = levels(test_set$obs))
```

```
## AUC Precision Recall F
## 0.8582695 0.5648148 0.9336735 0.7038462
```

There are also some nice figures on ROC curves vs. AUC in this tutorial:

https://developers.google.com/machine-learning/crash-course/classification/roc-and-auc