ICML 2024

Interplay of ROC and Precision-Recall AUCs: Theoretical Limits and Practical Implications in Binary Classification

Martin Mihelich, François Castagnos, Charles Dognin

Abstract

In this paper, we present two theorems with practical implications for machine learning practitioners working with binary classification models. The first theorem provides a formula for the maximum and minimum Precision-Recall AUC ($AUC_{PR}$) attainable at a fixed Receiver Operating Characteristic AUC ($AUC_{ROC}$), demonstrating how widely $AUC_{PR}$ can vary even when $AUC_{ROC}$ is high. This is particularly relevant for imbalanced datasets, where a good $AUC_{ROC}$ does not necessarily imply a high $AUC_{PR}$. The second theorem establishes the converse: the bounds on $AUC_{ROC}$ given a fixed $AUC_{PR}$. Our findings show that in certain situations, especially for imbalanced datasets, it is more informative to prioritize $AUC_{PR}$ over $AUC_{ROC}$. Additionally, we introduce a method to determine when a higher $AUC_{ROC}$ for one model than for another implies a higher $AUC_{PR}$ as well, and vice versa, streamlining the model evaluation process.
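As an illustration of the variability described above, the following minimal Python sketch builds two hypothetical rankings of the same imbalanced dataset (10 positives, 1000 negatives) that share the same $AUC_{ROC}$ but differ sharply in $AUC_{PR}$. It is not taken from the paper and does not implement the theorems' bounds. The function names and the two rankings are assumptions made for this example, and $AUC_{PR}$ is estimated here by average precision, which is one common estimator and may differ from the paper's exact definition.

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive.

    Assumption: used here as an estimator of the PR AUC; scores are distinct.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def scores_from_ranking(ranking):
    """Assign strictly decreasing scores to labels listed from best to worst."""
    n = len(ranking)
    return [float(n - i) for i in range(n)]


# Hypothetical ranking A: all 10 positives sit just behind the top 50 negatives.
ranking_a = [0] * 50 + [1] * 10 + [0] * 950
# Hypothetical ranking B: 9 positives at the very top, 1 positive ranked mid-list.
ranking_b = [1] * 9 + [0] * 500 + [1] + [0] * 500

auc_a = roc_auc(ranking_a, scores_from_ranking(ranking_a))
auc_b = roc_auc(ranking_b, scores_from_ranking(ranking_b))
ap_a = average_precision(ranking_a, scores_from_ranking(ranking_a))
ap_b = average_precision(ranking_b, scores_from_ranking(ranking_b))

print(f"A: ROC AUC = {auc_a:.3f}, average precision = {ap_a:.3f}")
print(f"B: ROC AUC = {auc_b:.3f}, average precision = {ap_b:.3f}")
```

Both rankings reach an $AUC_{ROC}$ of 0.95, yet the average precision is about 0.10 for ranking A and about 0.90 for ranking B. The ROC metric only counts how many negatives outrank each positive, while precision also depends on where in the list those negatives sit relative to the few positives.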