ICLR2026

A Faster Parameter-Free Regret Matching Algorithm

Linjian Meng, Youzhi Zhang, Shangdong Yang, Wenbin Li, Tianyu Ding, Yang Gao

摘要

Regret Matching (RM) and its variants are widely employed to learn a Nash equilibrium (NE) in large-scale games. However, most existing research only establishes a theoretical convergence rate of O(1/T)O(1/\sqrt{T}) for these algorithms in learning an NE. Recent studies have shown that smooth RM+^+ variants, the advanced variants of RM, can achieve an improved convergence rate of O(1/T)O(1/T). Despite this improvement, smooth RM+^+ variants lose the parameter-free property, i.e., no parameters that need to be tuned, a highly desirable feature in practical applications. In this paper, we propose a novel smooth RM+^+ variant called Monotone Increasing Smooth Predictive Regret Matching+^+ (MI-SPRM+^+), which retains the parameter-free property while still achieving a theoretical convergence rate of O(1/T)O(1/T). To achieve these properties, MI-SPRM+^+ employs a technology called Adaptive Regret Domain (ARD), which ensures that the lower bound for the 1-norm of accumulated regrets increases monotonically by adjusting the decision space at each iteration. This design is motivated by the observation that the range of step-sizes supporting the O(1/T)O(1/T) convergence rate in existing smooth RM+^+ variants is contingent on the lower bound for the 1-norm of accumulated regrets. Experimental results confirm that MI-SPRM+^+ empirically attains an O(1/T)O(1/T) convergence rate.