ICLR 2026

Why Ask One When You Can Ask $k$? Learning-to-Defer to the Top-$k$ Experts

Yannis Montreuil, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi


Abstract

Existing Learning-to-Defer (L2D) frameworks are limited to single-expert deferral, forcing each query to rely on only one expert and preventing the use of collective expertise. We introduce the first framework for Top-$k$ Learning-to-Defer, which allocates queries to the $k$ most cost-effective entities. Our formulation unifies and strictly generalizes prior approaches, including the one-stage and two-stage regimes, selective prediction, and classical cascades. In particular, it recovers the usual Top-1 deferral rule as a special case while enabling principled collaboration with multiple experts when $k>1$. We further propose Top-$k(x)$ Learning-to-Defer, an adaptive variant that learns the optimal number of experts per query based on input difficulty, expert quality, and consultation cost. To enable practical learning, we develop a novel surrogate loss that is Bayes-consistent, $\mathcal{H}_h$-consistent in the one-stage setting, and $(\mathcal{H}_r,\mathcal{H}_g)$-consistent in the two-stage setting. Crucially, this surrogate is independent of $k$, allowing a single policy to be learned once and deployed flexibly across $k$. Experiments across both regimes show that Top-$k$ and Top-$k(x)$ deliver superior accuracy–cost trade-offs, opening a new direction for multi-expert deferral in L2D.
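The selection step described in the abstract can be illustrated with a minimal sketch. It assumes a learned policy has already produced one score per entity for a given query (higher meaning more cost-effective); the function name `top_k_entities` and the example scores are hypothetical and are not taken from the paper. The sketch only shows how one fixed set of scores can serve any choice of $k$, with $k=1$ reducing to the usual single-entity rule.

```python
import numpy as np

def top_k_entities(scores, k):
    """Return the indices of the k highest-scoring entities, best first.

    `scores` is a hypothetical per-query score vector (one entry per entity);
    how such scores are learned is not covered by this sketch.
    """
    scores = np.asarray(scores, dtype=float)
    if not 1 <= k <= scores.size:
        raise ValueError("k must be between 1 and the number of entities")
    # Stable sort on negated scores: descending order, ties go to the lower index.
    order = np.argsort(-scores, kind="stable")
    return order[:k].tolist()

# Hypothetical scores for four entities on a single query.
scores = [0.2, 1.5, -0.3, 0.9]
top1 = top_k_entities(scores, 1)  # single-entity deferral
top3 = top_k_entities(scores, 3)  # same scores, larger k
```

Because the ranking is computed once and then truncated, changing $k$ at deployment time requires no change to the scores themselves, which mirrors the abstract's point that a single learned policy can be reused across values of $k$.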