ICLR 2026

Boosting for Predictive Sufficiency

Abbavaram Gowtham Reddy, Rajeev Verma, Celia Rubio-Madrigal, Krikamol Muandet, Rebekka Burkholz

Abstract

Out-of-distribution (OOD) generalization is a defining hallmark of robust and reliable machine learning systems. Recent empirical studies have observed that existing OOD generalization methods often underperform on real-world tabular data, where distribution shifts are driven by hidden confounding shifts and boosting models handle them more effectively. Earlier work attributes part of boosting's success to variance reduction, handling of missing covariates, feature selection, and connections to multicalibration. Complementary to these explanations, we uncover a crucial reason behind boosting's success in OOD generalization: its ability to identify environments created by hidden confounding shifts and to maximize predictive performance within those environments. To this end, we introduce an information-theoretic notion called α-predictive sufficiency and formalize its connection to OOD generalization under hidden confounding shift. We show that boosting implicitly identifies suitable environments and produces an α-predictive sufficient predictor. We validate our theoretical results through synthetic and real-world experiments, which show that boosting achieves robust performance by identifying these environments and maximizing the mutual information between predictions and true outcomes.
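
For readers less familiar with the quantity, the mutual information between predictions and true outcomes can be estimated from samples when both are discrete. The following is a minimal sketch and not the paper's procedure: the function name `empirical_mutual_information` is hypothetical, and the code assumes the standard plug-in estimator, which replaces the joint and marginal distributions with empirical frequencies and reports the result in nats.

```python
import math
from collections import Counter


def empirical_mutual_information(predictions, outcomes):
    """Plug-in estimate of I(Y_hat; Y) in nats for two discrete sequences.

    Hypothetical helper for illustration; it is not taken from the paper.
    """
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must have equal length")
    n = len(predictions)
    if n == 0:
        raise ValueError("need at least one sample")

    # Empirical counts of the joint and the two marginals.
    joint_counts = Counter(zip(predictions, outcomes))
    pred_counts = Counter(predictions)
    outcome_counts = Counter(outcomes)

    mi = 0.0
    for (pred, outcome), count in joint_counts.items():
        # p(pred, outcome) * log( p(pred, outcome) / (p(pred) * p(outcome)) ),
        # written with raw counts: the ratio simplifies to count * n / (c_pred * c_outcome).
        ratio = count * n / (pred_counts[pred] * outcome_counts[outcome])
        mi += (count / n) * math.log(ratio)
    return mi


if __name__ == "__main__":
    y_true = [0, 1, 0, 1]
    print(empirical_mutual_information(y_true, y_true))        # perfect predictor
    print(empirical_mutual_information([0, 0, 1, 1], y_true))  # uninformative predictor
```

On balanced binary outcomes, a predictor that reproduces the outcomes exactly attains the maximum of log 2 nats, while a predictor that is empirically independent of the outcomes scores zero. The plug-in estimator is biased upward on small samples, so it is best read as an illustration of the quantity rather than a recommended estimator.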