NeurIPS2025

Effective Policy Learning for Multi-Agent Online Coordination Beyond Submodular Objectives

Qixin Zhang, Yan Sun, Can Jin, Xikun Zhang, Yao Shu, Puning Zhao, Li Shen, Dacheng Tao

被引用 3 次

摘要

In this paper, we present two effective policy learning algorithms for multi-agent online coordination(MA-OC) problem. The first one, MA-SPL, not only can achieve the optimal $(1-\frac{c}{e})$ -approximation guarantee for the MA-OC problem with submodular objectives but also can handle the unexplored $\alpha$ -weakly DR-submodular and $(\gamma,\beta)$ -weakly submodular scenarios, where $c$ is the curvature of the investigated submodular functions, $\alpha$ denotes the diminishing-return(DR) ratio and the tuple $(\gamma,\beta)$ represents the submodularity ratios. Subsequently, in order to reduce the reliance on the unknown parameters $\alpha,\gamma,\beta$ inherent in the MA-SPL algorithm, we further introduce the second online algorithm named MA-MPL. This MA-MPL algorithm is entirely parameter-free and simultaneously can maintain the same approximation ratio as the first MA-SPL algorithm. The core of our MA-SPL and MA-MPL algorithms is a novel continuous-relaxation technique termed as policy-based continuous extension. Compared with the well-established multi-linear extension, a notable advantage of this new policy-based continuous extension is its ability to provide a lossless rounding scheme for any set function, thereby enabling us to tackle the challenging weakly submodular objectives. Finally, extensive simulations are conducted to validate the effectiveness of our proposed algorithms.