ICML 2025

Calibrated Language Models and How to Find Them with Label Smoothing

Jerry Huang, Peng Lu, Qiuhao Zeng

Abstract

Recent advances in natural language processing have enabled the fine-tuning of large language models (LLMs) into powerful interactive agents with improved instruction-following ability. However, this fine-tuning can affect confidence calibration, which matters for reliable model output, and the effect has not been fully studied. In this work, we examine various open-source LLMs and identify significant calibration degradation after instruction tuning. Seeking a practical solution, we look towards label smoothing, which has been shown to be an effective regularizer against overconfident predictions but has yet to be widely adopted in the supervised fine-tuning (SFT) of LLMs. We provide insight into why label smoothing can maintain calibration throughout the SFT process, but we also identify settings in which the effectiveness of smoothing is severely diminished. We posit that the cause stems from a model's ability to become overconfident, which is directly related to its hidden size and vocabulary size; we justify this claim theoretically and experimentally. Finally, we address an outstanding issue regarding the memory footprint of the cross-entropy loss computation with label smoothing by designing a customized kernel that dramatically reduces memory consumption without sacrificing speed or performance in comparison to existing solutions.
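For readers unfamiliar with label smoothing, the following minimal sketch shows the commonly used formulation, in which the one-hot target is mixed with a uniform distribution over the vocabulary before the cross-entropy is computed. It is an illustration only: the function name `label_smoothed_cross_entropy` and the default `smoothing=0.1` are assumptions made for this example, and the abstract does not state the exact formulation or the kernel used in this work.

```python
import math


def label_smoothed_cross_entropy(logits, target, smoothing=0.1):
    """Cross-entropy of one token against a smoothed target distribution.

    Assumed (standard) formulation, not taken from the paper:
        q_i = (1 - smoothing) * [i == target] + smoothing / V
    where V is the vocabulary size.
    """
    vocab_size = len(logits)

    # Numerically stable log-softmax over the vocabulary.
    max_logit = max(logits)
    log_norm = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    log_probs = [z - log_norm for z in logits]

    loss = 0.0
    for i, log_p in enumerate(log_probs):
        # Uniform mass on every class, plus the remaining mass on the target.
        q = smoothing / vocab_size
        if i == target:
            q += 1.0 - smoothing
        loss -= q * log_p
    return loss


# A very confident prediction on a toy 4-token vocabulary.
confident_logits = [10.0, 0.0, 0.0, 0.0]
hard_loss = label_smoothed_cross_entropy(confident_logits, target=0, smoothing=0.0)
smooth_loss = label_smoothed_cross_entropy(confident_logits, target=0, smoothing=0.1)
```

In this toy case the smoothed loss is larger than the unsmoothed loss even though the prediction is correct, which is how smoothing penalizes overconfidence. With `smoothing=0.0` the function reduces to the ordinary negative log-likelihood of the target token.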