NeurIPS2024
Gradient Methods for Online DR-Submodular Maximization with Stochastic Long-Term Constraints
Guanyu Nie, Vaneet Aggarwal, Christopher J. Quinn
Abstract
In this paper, we consider the problem of online monotone DR-submodular maximization subject to long-term stochastic constraints. Specifically, at each round t ∈ [ T ] , after committing an action x t , a random reward f t ( x t ) and an unbiased gradient estimate of the point (cid:101) ∇ f t ( x t ) (semi-bandit feedback) are revealed. Meanwhile, a budget of g t ( x t ) , which is linear and stochastic, is consumed of its total allotted budget B T . We propose a gradient ascent based algorithm that achieves 12 -regret of O ( √ T ) with O ( T 3 / 4 ) constraint violation with high probability. Moreover, when first-order full-information feedback is available, we propose an algorithm that achieves (1 − 1 /e ) -regret of O ( √ T ) with O ( T 3 / 4 ) constraint violation. These algorithms significantly improve over the state-of-the-art in terms of query complexity.