ICML2025

Heavy-Tailed Linear Bandits: Huber Regression with One-Pass Update

Jing Wang, Yu-Jie Zhang, Peng Zhao, Zhi-Hua Zhou

摘要

We study the stochastic linear bandits with heavytailed noise. Two principled strategies for handling heavy-tailed noise, truncation and medianof-means, have been introduced to heavy-tailed bandits. Nonetheless, these methods rely on specific noise assumptions or bandit structures, limiting their applicability to general settings. The recent work (Huang et al., 2023) develops a soft truncation method via the adaptive Huber regression to address these limitations. However, their method suffers undesired computational costs: it requires storing all historical data and performing a full pass over these data at each round. In this paper, we propose a one-pass algorithm based on the online mirror descent framework. Our method updates using only current data at each round, reducing the per-round computational cost from O(t log T ) to O(1) with respect to current round t and the time horizon T , and achieves a near-optimal and variance-aware regret of order where d is the dimension and ν 1+ε t is the (1 + ε)-th central moment of reward at round t.