STOC 2022
Near-optimal no-regret learning for correlated equilibria in multi-player general-sum games
Ioannis Anagnostides, Constantinos Daskalakis, Gabriele Farina, Maxwell Fishelson, Noah Golowich, Tuomas Sandholm
Cited by 16
Abstract
Recently, Daskalakis, Fishelson, and Golowich (DFG) (NeurIPS '21) showed that if all agents in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights Update (OMWU), the external regret of every player is O(polylog(T)) after T repetitions of the game. In this paper, we extend their result from external regret to internal and swap regret, thereby establishing uncoupled learning dynamics that converge to an approximate correlated equilibrium at the rate of Õ(T^{-1}). This substantially improves over the prior best rate of convergence of O(T^{-3/4}) due to Chen and Peng (NeurIPS '20), and it is optimal up to polylogarithmic factors.
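For readers unfamiliar with OMWU, the following is a minimal illustrative sketch, not the construction from the paper. It assumes the common formulation in which each player reweights actions by exp(eta * (2*u_t - u_{t-1})), and it runs two players on a small zero-sum game (a special case of the general-sum setting) while tracking only the external regret of player 1. The function names `omwu_step` and `simulate`, the step size `eta`, the starting strategies, and the payoff matrix are all assumptions chosen for illustration; the swap-regret dynamics the abstract refers to are not implemented here.

```python
import math


def omwu_step(weights, util_now, util_prev, eta):
    """One OMWU step: x_{t+1}(a) is proportional to
    x_t(a) * exp(eta * (2 * u_t(a) - u_{t-1}(a)))."""
    logits = [
        math.log(max(w, 1e-300)) + eta * (2.0 * u - p)
        for w, u, p in zip(weights, util_now, util_prev)
    ]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def simulate(A, x, y, eta, T):
    """Both players run OMWU on the zero-sum game with payoff matrix A
    (player 1 receives x^T A y, player 2 receives the negative).
    Returns the final strategies and player 1's external regret."""
    n, m = len(A), len(A[0])
    prev_u1, prev_u2 = [0.0] * n, [0.0] * m
    cum_u1 = [0.0] * n   # cumulative utility of each fixed action of player 1
    realized = 0.0       # cumulative expected utility player 1 actually earned
    for _ in range(T):
        u1 = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
        u2 = [-sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]
        realized += sum(xi * ui for xi, ui in zip(x, u1))
        cum_u1 = [c + u for c, u in zip(cum_u1, u1)]
        x, y = omwu_step(x, u1, prev_u1, eta), omwu_step(y, u2, prev_u2, eta)
        prev_u1, prev_u2 = u1, u2
    return x, y, max(cum_u1) - realized


if __name__ == "__main__":
    matching_pennies = [[1.0, -1.0], [-1.0, 1.0]]
    x, y, regret = simulate(matching_pennies, [0.9, 0.1], [0.2, 0.8], eta=0.1, T=2000)
    print("final strategies:", x, y)
    print("external regret of player 1:", regret)
```

Running the script prints the final mixed strategies and the cumulative external regret of player 1, which can be compared across different horizons T to observe how slowly it grows.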