Table 1 Performance characteristics of binary tests and continuous prediction models with various degrees of miscalibration. All values given were calculated directly from the formulae in the text and independently verified using a simulation approach (Appendix)

From: The Brier score does not evaluate the clinical utility of diagnostic tests or prediction models

     

| Test | Specificity | Sensitivity | AUC | Brier score | Net benefit (threshold 5%) | Net benefit (threshold 10%) | Net benefit (threshold 20%) |
|---|---|---|---|---|---|---|---|
| Binary tests | | | | | | | |
| Assume all negative | 100% | 0% | 0.500 | 0.2000 | 0.0000 | 0.0000 | 0.0000 |
| Assume all positive | 0% | 100% | 0.500 | 0.8000 | 0.1579 | 0.1111 | 0.0000 |
| Highly specific | 95% | 50% | 0.725 | 0.1400* (Method 1); 0.1169 (Method 2) | 0.0979 | 0.0956 | 0.0900 |
| Highly sensitive | 50% | 95% | 0.725 | 0.4100* (Method 1); 0.1386 (Method 2) | 0.1689 | 0.1456 | 0.0900 |
| Continuous prediction models | | | | | | | |
| Well calibrated | | | 0.75 | 0.1386 | 0.1595 | 0.1236 | 0.0716 |
| Overestimating risk | | | 0.75 | 0.1708 | 0.1583 | 0.1160 | 0.0423 |
| Underestimating risk | | | 0.75 | 0.1540 | 0.1483 | 0.0986 | 0.0413 |
| Severely underestimating risk | | | 0.75 | 0.1760 | 0.0921 | 0.0372 | 0.0076 |

  1. AUC, Brier score, and net benefit for various threshold probabilities corresponding to binary tests and continuous prediction models with various degrees of miscalibration predicting an outcome with prevalence of 20%, as shown in Fig. 1. Higher values of AUC and net benefit are desirable whereas lower values of the Brier score are desirable
  2. *Method 1 calculation: binary test is considered to produce probabilities of 1 and 0 for a positive and negative test, respectively
  3. Method 2 calculation: binary test is considered to produce probabilities of the positive predictive value and 1 − negative predictive value for a positive and negative test, respectively
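
The binary-test rows can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name `binary_test_metrics` is hypothetical, and the formulae are assumptions, since the article's own formulae are not reproduced in this table. It takes net benefit as true positives minus false positives weighted by the odds of the threshold probability (all expressed as proportions of the population), the AUC of a binary test as the mean of sensitivity and specificity, and the Brier score as the mean squared difference between predicted probability and outcome, using the Method 1 and Method 2 definitions from the footnotes.

```python
def binary_test_metrics(sensitivity, specificity, prevalence,
                        thresholds=(0.05, 0.10, 0.20)):
    """Illustrative metrics for a binary test (assumed formulae, see lead-in)."""
    # Cell proportions of the 2x2 table, as fractions of the whole population.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)

    # AUC of a single-cutpoint test: mean of sensitivity and specificity.
    auc = (sensitivity + specificity) / 2

    # Method 1: predictions are exactly 1 (positive) or 0 (negative),
    # so every misclassified patient contributes a squared error of 1.
    brier_method1 = fp + fn

    # Method 2: predictions are the PPV for a positive test and
    # 1 - NPV for a negative test.
    brier_method2 = 0.0
    if tp + fp > 0:
        ppv = tp / (tp + fp)
        brier_method2 += tp * (1 - ppv) ** 2 + fp * ppv ** 2
    if tn + fn > 0:
        risk_if_negative = fn / (fn + tn)  # equals 1 - NPV
        brier_method2 += fn * (1 - risk_if_negative) ** 2 + tn * risk_if_negative ** 2

    # Net benefit at each threshold probability pt.
    net_benefit = {pt: tp - fp * pt / (1 - pt) for pt in thresholds}

    return {
        "auc": auc,
        "brier_method1": brier_method1,
        "brier_method2": brier_method2,
        "net_benefit": net_benefit,
    }


if __name__ == "__main__":
    # "Highly specific" row: specificity 95%, sensitivity 50%, prevalence 20%.
    result = binary_test_metrics(sensitivity=0.50, specificity=0.95, prevalence=0.20)
    print(round(result["auc"], 3))
    print(round(result["brier_method1"], 4), round(result["brier_method2"], 4))
    for pt, nb in result["net_benefit"].items():
        print(f"threshold {pt:.0%}: net benefit {nb:.4f}")
```

Under these assumptions, calling the function with the sensitivity and specificity of each binary test and a prevalence of 20% should agree with the corresponding row of the table to the precision shown. The continuous-model rows depend on the risk distributions shown in Fig. 1 and are not covered by this sketch.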