Sensitivity for several prediction scores (medical)

Hey there,

I'm new to the R-Community so hopefully this is the right place to ask the question.

I'm currently working on a medical database and checking how well several scores predict a disease. The dataset contains these variables:

  • disease: either "true" or "false"
  • several scores (called "cutoff"): either 1 (= predicted false) or 2 (= predicted true)

I simply wanted to compute sensitivity, specificity and so on for every score. Computing them was pretty easy, but I wanted a table like:

         sens      spec
cutoff1   0.6      0.5 
cutoff2   0.7      0.4
cutoff3   0.8      0.6

as a result.
In addition, the code should be able to handle new scores (cutoffs) that might appear in the future.
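
For a single score, the computation meant here can be sketched as follows. The data frame db_toy and its values are made up purely for illustration and are not taken from the real database; the coding (disease "true"/"false", score 1 or 2) follows the description above.

```r
# Toy data, invented for illustration only (not the real medical database)
db_toy <- data.frame(
  disease = c("true", "true", "true", "false", "false", "false", "false", "true"),
  cutoff1 = c(2, 2, 1, 1, 1, 2, 1, 2)
)

is_diseased <- db_toy$disease == "true"
pred_true   <- db_toy$cutoff1 == 2   # 2 = predicted true
pred_false  <- db_toy$cutoff1 == 1   # 1 = predicted false

# sensitivity = true positives / all diseased cases
sens <- sum(is_diseased & pred_true) / sum(is_diseased)
# specificity = true negatives / all non-diseased cases
spec <- sum(!is_diseased & pred_false) / sum(!is_diseased)

c(sens = sens, spec = spec)
```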

So after tinkering around I decided to write a function based on add_column and add_row that takes two arguments:

  1. the database to be analyzed
  2. a character vector where every element is the name of a score (cutoff).

This finally led to my function:

# packages used below: dplyr (%>%, summarize, .data) and tibble (add_column, add_row)
library(dplyr)
library(tibble)

sensitivity_function <- function(db,cutoffs){
  # create the table that will be the output
  table <- data.frame() %>% add_column(
    cutoff_name = "",  # character, so add_row() can append the cutoff names later
    true_pos = 0,
    false_neg = 0,
    false_pos = 0,
    true_neg = 0,
    cond_pos = 0,
    pred_pos = 0,
    cond_neg = 0,
    pred_neg = 0,
    sensitivity = 0,
    specifity = 0,
    false_pos_rate = 0,
    false_neg_rate = 0
  )
  
  # iterate over all cutoffs
  for(cutoff in cutoffs){
    # create the variables I need
    db2 <- db %>% summarize(
      true_pos = sum(disease == "true" & as.numeric(.data[[cutoff]]) == 2),
      false_neg = sum(disease == "true" & as.numeric(.data[[cutoff]]) == 1),
      false_pos = sum(disease != "true" & as.numeric(.data[[cutoff]]) == 2),
      true_neg = sum(disease != "true" & as.numeric(.data[[cutoff]]) == 1),
      cond_pos = sum(disease == "true"),
      pred_pos = sum(as.numeric(.data[[cutoff]]) == 2),
      cond_neg = sum(disease != "true"),
      pred_neg = sum(as.numeric(.data[[cutoff]]) == 1),
      
      sensitivity = round(true_pos / cond_pos, 2),
      specifity = round(true_neg / cond_neg, 2),
      false_pos_rate = round(false_pos / cond_neg, 2),
      false_neg_rate = round(false_neg / cond_pos, 2)
    )
    
    # add those values for the current cutoff to the table
    table <- table %>% add_row(
      cutoff_name = cutoff,
      true_pos = db2$true_pos[1],
      false_neg = db2$false_neg[1],
      false_pos = db2$false_pos[1],
      true_neg = db2$true_neg[1],
      cond_pos = db2$cond_pos[1],
      pred_pos = db2$pred_pos[1],
      cond_neg = db2$cond_neg[1],
      pred_neg = db2$pred_neg[1],
      sensitivity = db2$sensitivity[1],
      specifity = db2$specifity[1],
      false_pos_rate = db2$false_pos_rate[1],
      false_neg_rate = db2$false_neg_rate[1]
    )
  }
  
  # put out the table
  table
}

# example how to run it
cutoffs <- c("cutoff1","cutoff2","cutoff3")
table <- sensitivity_function(db,cutoffs)

table[c(1,seq(10,13))]

gives

  cutoff_name sensitivity specifity false_pos_rate false_neg_rate
1     cutoff1        0.82      0.69           0.31           0.18
2     cutoff2        0.81      0.71           0.29           0.19
3     cutoff3        0.73      0.77           0.23           0.27

Is there an easier option to do it?

Thank you!
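
One shorter variant, offered as a sketch rather than a definitive answer: build one small data frame per score with lapply and stack them with rbind, so no empty table has to be set up first. The function name score_metrics and the toy data db_toy are hypothetical and only serve the illustration; the coding of disease and of the scores is assumed to be as described in the question.

```r
# Hypothetical helper: one row of metrics per score column named in `cutoffs`
score_metrics <- function(db, cutoffs) {
  diseased <- db$disease == "true"
  rows <- lapply(cutoffs, function(cutoff) {
    score <- as.numeric(db[[cutoff]])
    pred_true  <- score == 2   # 2 = predicted true
    pred_false <- score == 1   # 1 = predicted false
    data.frame(
      cutoff_name    = cutoff,
      sensitivity    = round(sum(diseased & pred_true) / sum(diseased), 2),
      specificity    = round(sum(!diseased & pred_false) / sum(!diseased), 2),
      false_pos_rate = round(sum(!diseased & pred_true) / sum(!diseased), 2),
      false_neg_rate = round(sum(diseased & pred_false) / sum(diseased), 2),
      stringsAsFactors = FALSE
    )
  })
  # stack the per-score rows into one table
  do.call(rbind, rows)
}

# Toy data, invented for illustration only
db_toy <- data.frame(
  disease = c("true", "true", "true", "true", "false", "false", "false", "false"),
  cutoff1 = c(2, 2, 2, 1, 1, 1, 2, 1),
  cutoff2 = c(2, 2, 1, 1, 1, 1, 1, 1)
)

result <- score_metrics(db_toy, c("cutoff1", "cutoff2"))
result
```

A new score then only needs its column name added to the vector passed as cutoffs.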
