NeurIPS2024

Optimizing over Multiple Distributions under Generalized Quasar-Convexity Condition

Shihong Ding, Long Yang, Luo Luo, Cong Fang

Abstract

We study a typical optimization model where the optimization variable is composed of multiple probability distributions. Though the model appears frequently in practice, such as for policy problems, it lacks specific analysis in the general setting. For this optimization problem, we propose a new structural condition/landscape description named generalized quasar-convexity (GQC) beyond the realms of convexity. In contrast to original quasar-convexity (Hinder, Sidford, & Sohoni, 2020) , GQC allows an individual quasar-convex parameter γi for each variable block i and the smaller of γi implies less block-convexity. To minimize the objective function, we consider a generalized oracle termed as the internal function that includes the standard gradient oracle as a special case. We provide optimistic mirror descent (OMD) for multiple distributions and prove that the algorithm can achieve an adaptive Õ(( d i=1 1/γi)ε -1 ) iteration complexity to find an ε-suboptimal global solution without pre-known the exact values of γi when the objective admits "polynomial-like" structural. Notably, it achieves iteration complexity that does not explicitly depend on the number of distributions and strictly faster ( d i=1 1/γi v.s. d max i∈[1:d] 1/γi) than mirror decent methods. We also extend GQC to the minimax optimization problem proposing the generalized quasar-convexity-concavity (GQCC) condition and a decentralized variant of OMD with regularization. Finally, we show the applications of our algorithmic framework on discounted Markov Decision Processes problem and Markov games, which bring new insights on the landscape analysis of reinforcement learning. Solution type Related work Iteration complexity Single loop ε-approximate NE Cen, Wei, and Chi (2021) Z. Chen, Ma, and Zhou ( 2021 ) Comparison of policy optimization methods for finding an ε-approximate NE of infinite horizon two-player zero-sum Markov games in terms of the max-min gap (see Eq. ( 4 )). Since the iteration complexity of several research works (such as Zhao et al. (2022 ), Alacaoglu et al. (2022) and Zeng et al. ( 2022 )) involve concentrability coefficient and initial distribution mismatch coefficient, we will not delve into them here. Contribution (A) We introduce new structural conditions GQC for minimization problems and GQCC for minimax problems over multiple distributions. (B) We provide adaptive algorithm that achieves O(( d i=1 1/γ i )ε -1 ) iteration complexities to find an ε-suboptimal global minimum of "polynomial-like" function under GQC. We also provide an implementable minimax algorithm, given a generalized quasar-convex-concave function with proper conditions, uses O((1 -θ) -2.5 max z∈Z ( d i=1 ψ i (z)) ε -1 ) iterations to find an ε-approximate Nash equilibrium. (C) We show that discounted MDP and infinite horizon two-player zero-sum Markov games admit the GQC and GQCC conditions, respectively, and also satisfy our mild assumptions. In addition, we provide O((1 -θ) -2.5 ε -1 ) iteration bound for finding an ε-approximate Nash equilibrium of infinite horizon two-player zero-sum Markov games. Detailed comparisons between our method and prior arts are provided in Table 1 . Related Works Minimization: Convexity condition has been studied at length and plays a critical role in optimizing minimization problems (