ICLR2026

A Faster Parameter-Free Regret Matching Algorithm

Linjian Meng, Youzhi Zhang, Shangdong Yang, Wenbin Li, Tianyu Ding, Yang Gao

摘要

Regret Matching (RM) and its variants are widely employed to learn a Nash equilibrium (NE) in large-scale games. However, most existing research only establishes a theoretical convergence rate of $O(1/\sqrt{T})$ for these algorithms in learning an NE. Recent studies have shown that smooth RM $^+$ variants, the advanced variants of RM, can achieve an improved convergence rate of $O(1/T)$ . Despite this improvement, smooth RM $^+$ variants lose the parameter-free property, i.e., no parameters that need to be tuned, a highly desirable feature in practical applications. In this paper, we propose a novel smooth RM $^+$ variant called Monotone Increasing Smooth Predictive Regret Matching $^+$ (MI-SPRM $^+$ ), which retains the parameter-free property while still achieving a theoretical convergence rate of $O(1/T)$ . To achieve these properties, MI-SPRM $^+$ employs a technology called Adaptive Regret Domain (ARD), which ensures that the lower bound for the 1-norm of accumulated regrets increases monotonically by adjusting the decision space at each iteration. This design is motivated by the observation that the range of step-sizes supporting the $O(1/T)$ convergence rate in existing smooth RM $^+$ variants is contingent on the lower bound for the 1-norm of accumulated regrets. Experimental results confirm that MI-SPRM $^+$ empirically attains an $O(1/T)$ convergence rate.