NeurIPS 2022

Faster Stochastic Algorithms for Minimax Optimization under Polyak-Łojasiewicz Condition

Lesi Chen, Boyuan Yao, Luo Luo

22 citations

Abstract

This paper considers stochastic first-order algorithms for minimax optimization under Polyak-Łojasiewicz (PL) conditions. We propose SPIDER-GDA for solving the finite-sum problem of the form $\min_x \max_y f(x,y)\triangleq \frac{1}{n} \sum_{i=1}^n f_i(x,y)$, where the objective function $f(x,y)$ is $\mu_x$-PL in $x$ and $\mu_y$-PL in $y$, and each $f_i(x,y)$ is $L$-smooth. We prove that SPIDER-GDA finds an $\epsilon$-optimal solution with a stochastic first-order oracle (SFO) complexity of ${\mathcal O}\left((n + \sqrt{n}\,\kappa_x\kappa_y^2)\log (1/\epsilon)\right)$, improving on the state-of-the-art method, whose SFO upper bound is ${\mathcal O}\big((n + n^{2/3}\kappa_x\kappa_y^2)\log (1/\epsilon)\big)$, where $\kappa_x\triangleq L/\mu_x$ and $\kappa_y\triangleq L/\mu_y$. For the ill-conditioned case, we provide an accelerated algorithm that reduces the computational cost further. It achieves an SFO upper bound of $\tilde{{\mathcal O}}\big((n+\sqrt{n}\,\kappa_x\kappa_y)\log (\kappa_y/\epsilon) \log(1/\epsilon)\big)$ when $\kappa_y \gtrsim \sqrt{n}$. Our ideas can also be applied to a more general setting where the objective function satisfies the PL condition in only one variable. Numerical experiments validate the superiority of the proposed methods.
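To make the setting concrete, the following is a minimal sketch of a SPIDER-style variance-reduced gradient descent ascent loop on a toy finite-sum problem. It is not the authors' exact algorithm: the scalar quadratic components $f_i(x,y) = \tfrac{1}{2}a_i x^2 + b_i xy - \tfrac{1}{2}c_i y^2$, the function names, the step sizes, and the choice of epoch length and batch size near $\sqrt{n}$ are all illustrative assumptions. The toy objective is strongly convex in $x$ and strongly concave in $y$, which implies the two-sided PL condition, and its saddle point is $(0, 0)$.

```python
import random


def grad_i(a, b, c, x, y):
    """Gradient of the assumed toy component f_i = 0.5*a*x^2 + b*x*y - 0.5*c*y^2."""
    return a * x + b * y, b * x - c * y


def spider_gda_sketch(coeffs, x, y, tau_x=0.05, tau_y=0.1,
                      epoch_len=None, batch_size=None, iters=2000, seed=0):
    """SPIDER-style GDA sketch; all hyperparameters are illustrative guesses."""
    rng = random.Random(seed)
    n = len(coeffs)
    q = epoch_len or max(1, int(n ** 0.5))      # full-gradient refresh period
    m = batch_size or max(1, int(n ** 0.5))     # mini-batch size in between
    vx = vy = 0.0
    x_prev, y_prev = x, y
    for t in range(iters):
        if t % q == 0:
            # Periodic refresh: exact gradient over all n components.
            grads = [grad_i(a, b, c, x, y) for (a, b, c) in coeffs]
            vx = sum(g[0] for g in grads) / n
            vy = sum(g[1] for g in grads) / n
        else:
            # Recursive estimator: add the mini-batch average of gradient
            # differences between the current and the previous iterate.
            dx = dy = 0.0
            for _ in range(m):
                a, b, c = coeffs[rng.randrange(n)]
                gx_new, gy_new = grad_i(a, b, c, x, y)
                gx_old, gy_old = grad_i(a, b, c, x_prev, y_prev)
                dx += gx_new - gx_old
                dy += gy_new - gy_old
            vx += dx / m
            vy += dy / m
        x_prev, y_prev = x, y
        # Simultaneous step: descent in x, ascent in y.
        x, y = x - tau_x * vx, y + tau_y * vy
    return x, y


if __name__ == "__main__":
    data_rng = random.Random(1)
    coeffs = [(data_rng.uniform(1, 2), data_rng.uniform(-1, 1),
               data_rng.uniform(1, 2)) for _ in range(100)]
    print(spider_gda_sketch(coeffs, 1.0, -1.0))
```

Each epoch in this sketch costs $n$ gradient evaluations for the refresh plus roughly $2m$ per intermediate step, which is the kind of accounting that SFO complexity bounds measure. The step sizes here are hand-picked for the toy problem and do not follow the condition-number-dependent choices analyzed in the paper.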