NeurIPS2025

Effective Policy Learning for Multi-Agent Online Coordination Beyond Submodular Objectives

Qixin Zhang, Yan Sun, Can Jin, Xikun Zhang, Yao Shu, Puning Zhao, Li Shen, Dacheng Tao

被引用 3 次

摘要

In this paper, we present two effective policy learning algorithms for multi-agent online coordination(MA-OC) problem. The first one, MA-SPL, not only can achieve the optimal (1ce)(1-\frac{c}{e})-approximation guarantee for the MA-OC problem with submodular objectives but also can handle the unexplored α\alpha-weakly DR-submodular and (γ,β)(\gamma,\beta)-weakly submodular scenarios, where cc is the curvature of the investigated submodular functions, α\alpha denotes the diminishing-return(DR) ratio and the tuple (γ,β)(\gamma,\beta) represents the submodularity ratios. Subsequently, in order to reduce the reliance on the unknown parameters α,γ,β\alpha,\gamma,\beta inherent in the MA-SPL algorithm, we further introduce the second online algorithm named MA-MPL. This MA-MPL algorithm is entirely parameter-free and simultaneously can maintain the same approximation ratio as the first MA-SPL algorithm. The core of our MA-SPL and MA-MPL algorithms is a novel continuous-relaxation technique termed as policy-based continuous extension. Compared with the well-established multi-linear extension, a notable advantage of this new policy-based continuous extension is its ability to provide a lossless rounding scheme for any set function, thereby enabling us to tackle the challenging weakly submodular objectives. Finally, extensive simulations are conducted to validate the effectiveness of our proposed algorithms.