ICML 2025

On Temperature Scaling and Conformal Prediction of Deep Classifiers

Lahav Dabah, Tom Tirer

Abstract

Key Idea

Conformal prediction outputs a set of possible classes that is guaranteed to contain the true label with a user-defined confidence level.

Key Properties

• Works with any model (black-box access)
• Acts as a post-processing step

Evaluating Conformal Prediction

1. AvgSize: average size of the prediction sets
2. TopCovGap: worst-case gap in coverage across classes

Note: for both metrics, lower is better.
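As an illustration of the points above, the following is a minimal sketch of split conformal prediction applied as a post-processing step to a black-box classifier's outputs, together with the two metrics. Several choices here are assumptions and not taken from the poster: the conformity score (one minus the softmax probability of the class), the synthetic logits, and all function names. TopCovGap is implemented literally as the largest absolute deviation of per-class coverage from the target 1 - alpha; the authors' exact definition may differ.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Softmax with temperature T; T=1.0 leaves the logits unchanged.
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def calibrate_threshold(cal_probs, cal_labels, alpha):
    # Assumed score: 1 - probability assigned to the true class.
    n = len(cal_labels)
    scores = np.sort(1.0 - cal_probs[np.arange(n), cal_labels])
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.inf if k > n else scores[k - 1]

def predict_sets(probs, qhat):
    # Boolean mask of shape (n, K): True where a class is in the set.
    return (1.0 - probs) <= qhat

def avg_size(sets):
    # Average number of classes per prediction set.
    return sets.sum(axis=1).mean()

def top_cov_gap(sets, labels, alpha):
    # Assumed definition: max over classes of |class coverage - (1 - alpha)|.
    covered = sets[np.arange(len(labels)), labels]
    gaps = [abs(covered[labels == c].mean() - (1 - alpha))
            for c in np.unique(labels)]
    return max(gaps)

# Synthetic stand-in for a trained model's logits (hypothetical data).
rng = np.random.default_rng(0)
n, K, alpha = 4000, 10, 0.1
labels = rng.integers(0, K, size=n)
logits = rng.normal(size=(n, K))
logits[np.arange(n), labels] += 2.0  # make the true class more likely
probs = softmax(logits)

# First half calibrates the threshold, second half is evaluated.
qhat = calibrate_threshold(probs[:2000], labels[:2000], alpha)
sets = predict_sets(probs[2000:], qhat)
test_labels = labels[2000:]

print("AvgSize:", avg_size(sets))
print("TopCovGap:", top_cov_gap(sets, test_labels, alpha))
```

Only the model's output probabilities are used, which is what makes the procedure black-box: the classifier itself is never retrained or inspected.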