NeurIPS2024

Piecewise-Stationary Bandits with Knapsacks

Xilin Zhang, Wang Chi Cheung

摘要

We propose a novel inventory reserving algorithm which draws new insights into Bandits with Knapsacks (Bwk) problems in piecewise-stationary environments. Suppose parameters η min , η max P p 0 , 1 s respectively lower and upper bound the ratio between the reward earned and the resources consumed in a round. Our algo-rithm achieves a provably near-optimal competitive ratio of O p log p η max η min qq , with a matching lower bound provided. Our performance guarantee is based on a dynamic benchmark that upper bounds the optimum, different from existing works on adversarial Bwk Immorlica et al. (2019); Kesselheim and Singla (2020) who compare with the stationary benchmark. Different from existing non-stationary Bwk work Liu et al. (2022), we do not require a bounded global variation.