STOC 2022

Near-optimal no-regret learning for correlated equilibria in multi-player general-sum games

Ioannis Anagnostides, Constantinos Daskalakis, Gabriele Farina, Maxwell Fishelson, Noah Golowich, Tuomas Sandholm

Cited by 16

Abstract

Recently, Daskalakis, Fishelson, and Golowich (DFG) (NeurIPS '21) showed that if all agents in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights Update (OMWU), the external regret of every player is $O(\mathrm{polylog}(T))$ after $T$ repetitions of the game. In this paper we extend their result from external regret to internal and swap regret, thereby establishing uncoupled learning dynamics that converge to an approximate correlated equilibrium at the rate of $\widetilde{O}(T^{-1})$. This substantially improves over the prior best rate of convergence of $O(T^{-3/4})$ due to Chen and Peng (NeurIPS '20), and it is optimal up to polylogarithmic factors.
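To make the baseline dynamic concrete, the following is a minimal sketch of OMWU in its standard textbook form, run by both players of a two-player normal-form game, together with the external regret of the row player. It illustrates only the external-regret setting attributed to DFG above; it is not the internal or swap regret construction of this paper. The restriction to two players, the function names (`omwu_step`, `expected_utilities`, `run_omwu`), the step size `eta`, and the use of expected (full-information) utilities as feedback are all assumptions made for illustration.

```python
import math


def omwu_step(x, u_now, u_prev, eta):
    """One OMWU update on a mixed strategy x (assumed form of the rule).

    New weight of action a is proportional to
    x[a] * exp(eta * (2 * u_now[a] - u_prev[a])),
    i.e. the current utility plus the optimistic correction (u_now - u_prev).
    """
    logits = [math.log(p) + eta * (2.0 * un - up)
              for p, un, up in zip(x, u_now, u_prev)]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_utilities(payoff, opponent):
    """Expected utility of each own action against the opponent's mixed strategy."""
    return [sum(row[b] * opponent[b] for b in range(len(opponent)))
            for row in payoff]


def run_omwu(A, B, T=1000, eta=0.1):
    """Run OMWU for both players; return final strategies and row external regret.

    A[a][b] is the row player's utility and B[a][b] the column player's utility
    when the row player picks a and the column player picks b.
    """
    n, m = len(A), len(A[0])
    B_t = [[B[a][b] for a in range(n)] for b in range(m)]  # column player's view
    x, y = [1.0 / n] * n, [1.0 / m] * m
    ux_prev, uy_prev = [0.0] * n, [0.0] * m
    cumulative = [0.0] * n  # total utility of each fixed row action
    realized = 0.0          # total expected utility actually obtained
    for _ in range(T):
        ux = expected_utilities(A, y)
        uy = expected_utilities(B_t, x)
        realized += sum(p * u for p, u in zip(x, ux))
        cumulative = [c + u for c, u in zip(cumulative, ux)]
        # Both players update simultaneously from the same round's feedback.
        x, y = omwu_step(x, ux, ux_prev, eta), omwu_step(y, uy, uy_prev, eta)
        ux_prev, uy_prev = ux, uy
    # External regret: best fixed action in hindsight minus realized utility.
    return x, y, max(cumulative) - realized
```

Calling `run_omwu(A, B, T)` for increasing `T` and dividing the returned regret by `T` gives the average regret, which is the quantity that governs the convergence rate discussed in the abstract. The fixed `eta` here is only a placeholder; the cited results depend on a specific choice of step size that this sketch does not reproduce.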