ICLR 2026

How Stable is the Next Token? A Geometric View of LLM Prediction Stability

Deyuan Liu, Zecheng Wang, Zhanyue Qin, Zhiying Tu, Dianhui Chu, Dianbo Sui

Abstract

Large Language Models (LLMs) exhibit impressive capabilities yet remain sensitive to slight variations in input context, which hampers their reliability. Conventional metrics such as accuracy and perplexity fail to assess local prediction robustness, because normalized output probabilities can obscure how resilient an LLM's internal state is to perturbations. We introduce the Token Constraint Bound ($\delta_{\mathrm{TCB}}$), a novel metric that quantifies the maximum internal-state perturbation an LLM can withstand before its dominant next-token prediction changes significantly. Intrinsically linked to the geometry of the output embedding space, $\delta_{\mathrm{TCB}}$ provides insight into the stability of the model's internal predictive commitment. Our experiments show that $\delta_{\mathrm{TCB}}$ correlates with effective prompt engineering and uncovers critical prediction instabilities that perplexity misses during in-context learning and text generation. $\delta_{\mathrm{TCB}}$ offers a principled, complementary approach to analyzing and potentially improving the contextual stability of LLM predictions.