Is it possible to get the thresholds used to calculate Area Under the Curve?

caret

#1

Here's an example script that fits a xgb model using evaluation metric prAUC (prSummary in trainControl with metric set to "AUC" in train)

library(caret)
library(tidyverse)

# example data
sampledata <- diamonds %>% mutate(target = ifelse(cut == "Premium", 1, 0) %>% make.names() %>% as.factor())

# fit a XGB model
train_control <- trainControl(
  method = "cv",
  number = 5,
  classProbs = TRUE,
  verboseIter = TRUE,
  summaryFunction = prSummary,
  savePredictions = TRUE,
  allowParallel = TRUE
)

## tuning grid
tune_grid <- expand.grid(nrounds = 200,
                         max_depth = 5,
                         eta = 0.05,
                         gamma = 0.01,
                         colsample_bytree = 0.75,
                         min_child_weight = 0,
                         subsample = 0.5)

## xgb
xgb_model <- train(
               x = select(sampledata, -c(cut, target, clarity, color)), 
               y = sampledata$target,
               method = "xgbTree",
               metric = "AUC", #actually prAUC sinc eusing prSummary
               trControl = train_control,
               tuneGrid = tune_grid,
               tuneLength = 10)

After running this if I type xgb_model I see (pr)AUC of 0.978. My question is, is it possible to retrieve the thresholds that were used in this calculation?

A friend is running a similar model on python with scikit learn and we'd like to see if we can more closely compare our models. See https://scikit-learn.org/stable/modules/generated/sklearn.metrics.average_precision_score.html

From that link "This implementation is not interpolated and is different from computing the area under the precision-recall curve with the trapezoidal rule, which uses linear interpolation and can be too optimistic". If I was able to determine how many cut offs are being used by caret we could do a more like for like comparison of our models.

I glimpsed the model xgb_model but could not see anything called "threshold" though I'm sure I've come across the term within context of R in the past. Is there a way to get these threholds?


#2

You simply use the predictions as 'thresholds', so e.g. given:

> data.frame(y_pred = runif(10), y_true = sample(c(0,1), 10, replace = TRUE))
      y_pred y_true
1  0.6715478      0
2  0.8195236      0
3  0.9768838      1
4  0.5271831      0
5  0.6940112      0
6  0.2799723      0
7  0.9881248      0
8  0.8947085      0
9  0.6890358      1
10 0.6081315      0

The y_pred is used as 'thresholds', when calculating the AUC

Hope it helps :slightly_smiling_face:


#3

Thank you for clarifying


#4

YW :slightly_smiling_face:


#5

You can rip out the underlying predictions and then use yardstick to compute the full PR AUC curve using pr_curve(). You can actually compute the area under the curve using pr_auc() which uses the data from pr_curve().

library(caret)
library(tidyverse)
library(yardstick)

# to see lots of decimal places in a tibble
options(pillar.sigfig = 8)

# for reproducibility
set.seed(123)

# example data
sampledata <- diamonds %>% mutate(target = ifelse(cut == "Premium", 1, 0) %>% make.names() %>% as.factor())

# fit a XGB model
train_control <- trainControl(
  method = "cv",
  number = 5,
  classProbs = TRUE,
  verboseIter = TRUE,
  summaryFunction = prSummary,
  savePredictions = TRUE,
  allowParallel = TRUE
)

## tuning grid
tune_grid <- expand.grid(nrounds = 200,
                         max_depth = 5,
                         eta = 0.05,
                         gamma = 0.01,
                         colsample_bytree = 0.75,
                         min_child_weight = 0,
                         subsample = 0.5)

## xgb
xgb_model <- train(
  x = select(sampledata, -c(cut, target, clarity, color)), 
  y = sampledata$target,
  method = "xgbTree",
  metric = "AUC", #actually prAUC sinc eusing prSummary
  trControl = train_control,
  tuneGrid = tune_grid,
  tuneLength = 10)

# Take a look at the model
xgb_model
#> eXtreme Gradient Boosting 
#> 
#> 53940 samples
#>     7 predictor
#>     2 classes: 'X0', 'X1' 
#> 
#> No pre-processing
#> Resampling: Cross-Validated (5 fold) 
#> Summary of sample sizes: 43153, 43151, 43152, 43152, 43152 
#> Resampling results:
#> 
#>   AUC        Precision  Recall     F        
#>   0.9702641  0.925353   0.8808189  0.9025308
#> 
#> Tuning parameter 'nrounds' was held constant at a value of 200
#>  0.75
#> Tuning parameter 'min_child_weight' was held constant at a value
#>  of 0
#> Tuning parameter 'subsample' was held constant at a value of 0.5

# Inside $pred is all the info we need
# to compute the full pr auc curve
xgb_model$pred %>%
  select(obs, X0, Resample) %>%
  slice(1:5)
#>   obs        X0 Resample
#> 1  X0 0.9783865    Fold1
#> 2  X0 0.9885217    Fold1
#> 3  X1 0.2936884    Fold1
#> 4  X0 0.9990451    Fold1
#> 5  X0 0.7013983    Fold1

# calculate the full pr curve with thresholds
xgb_pr_curve <- xgb_model$pred %>%
  group_by(Resample) %>%
  pr_curve(obs, X0)

head(xgb_pr_curve)
#> # A tibble: 6 x 4
#> # Groups:   Resample [1]
#>   Resample   .threshold        recall precision
#>   <chr>           <dbl>         <dbl>     <dbl>
#> 1 Fold1    Inf          0                    NA
#> 2 Fold1      0.99938977 0.00012459507         1
#> 3 Fold1      0.99936575 0.00024919013         1
#> 4 Fold1      0.99936062 0.00037378520         1
#> 5 Fold1      0.99935418 0.00049838026         1
#> 6 Fold1      0.99935120 0.00062297533         1

# plot it!
autoplot(xgb_pr_curve)
#> Warning: Removed 5 rows containing missing values (geom_path).


# To show that this is basically the same information
# caret uses to compute the pr auc score,
# compute the PR AUC per resample using yardstick
pr_auc_per_resample <- xgb_model$pred %>%
  group_by(Resample) %>%
  pr_auc(obs, X0)

pr_auc_per_resample
#> # A tibble: 5 x 4
#>   Resample .metric .estimator  .estimate
#>   <chr>    <chr>   <chr>           <dbl>
#> 1 Fold1    pr_auc  binary     0.96888428
#> 2 Fold2    pr_auc  binary     0.97038900
#> 3 Fold3    pr_auc  binary     0.97103242
#> 4 Fold4    pr_auc  binary     0.97116810
#> 5 Fold5    pr_auc  binary     0.96971711

# Average them to get the resampled pr auc score
pr_auc_per_resample %>%
  summarise(
    resampled_pr_auc = mean(.estimate)
  )
#> # A tibble: 1 x 1
#>   resampled_pr_auc
#>              <dbl>
#> 1       0.97023818

# There is a slight difference between this
# and the PR AUC result from MLmetrics, but im not
# sure why at the moment. This is the PR AUC you get from
# MLmetrics, which aligns with what caret shows you
xgb_model$pred %>%
  group_by(Resample) %>%
  summarise(
    pr_auc = MLmetrics::PRAUC(X0, ifelse(obs == "X0", 1, 0))
  ) %>%
  summarise(
    resampled_pr_auc = mean(pr_auc)
  )
#> # A tibble: 1 x 1
#>   resampled_pr_auc
#>              <dbl>
#> 1       0.97026406

Created on 2019-01-04 by the reprex package (v0.2.1.9000)


#6

Hi, thanks for this! When I tried your code block I get:

pr_auc_per_resample <- xgb_model$pred %>%
  group_by(Resample) %>%
  pr_auc(obs, X0)
Error: all handlers should inherit from `exiting` or `inplace`

same with

> xgb_pr_curve <- xgb_model$pred %>%
+     group_by(Resample) %>%
+     pr_curve(obs, X0)
Error: all handlers should inherit from `exiting` or `inplace`

Any ideas how to get around this?


#7

You might need to update yardstick but I've actually never seen that error before. If that doesn't work, let me know

Update) This actually smells more like an rlang problem. We use rlang::with_handlers() in yardstick and I think that is where this error comes from. Try updating rlang and yardstick.


#8

Wow @davis - That's not an answer, that's blog post! Kudos :+1:


#9

Hi, thanks for the suggestion, updating rlang solved my problem here


#10

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.