ICML2021

One-sided Frank-Wolfe algorithms for saddle problems

Vladimir Kolmogorov, Thomas Pock

5 citations

Abstract

We study a class of convex-concave saddle-point problems of the form minxmaxyKx,y+fP(x)h(y)\min_x\max_y \langle Kx,y\rangle+f_{\cal{P}}(x)-h^\ast(y) where KK is a linear operator, fPf_{\cal{P}} is the sum of a convex function ff with a Lipschitz-continuous gradient and the indicator function of a bounded convex polytope P\cal{P}, and hh^\ast is a convex (possibly nonsmooth) function. Such problem arises, for example, as a Lagrangian relaxation of various discrete optimization problems. Our main assumptions are the existence of an efficient linear minimization oracle (lmolmo) for fPf_{\cal{P}} and an efficient proximal map for hh^* which motivate the solution via a blend of proximal primal-dual algorithms and Frank-Wolfe algorithms. In case hh^* is the indicator function of a linear constraint and function ff is quadratic, we show a O(1/n2)O(1/n^2) convergence rate on the dual objective, requiring O(nlogn)O(n \log n) calls of lmolmo. If the problem comes from the constrained optimization problem minxRd{fP(x)Axb=0}\min_{x\in\mathbb R^d}\{f_{\cal{P}}(x)\:|\:Ax-b=0\} then we additionally get bound O(1/n2)O(1/n^2) both on the primal gap and on the infeasibility gap. In the most general case, we show a O(1/n)O(1/n) convergence rate of the primal-dual gap again requiring O(nlogn)O(n\log n) calls of lmolmo. To the best of our knowledge, this improves on the known convergence rates for the considered class of saddle-point problems. We show applications to labeling problems frequently appearing in machine learning and computer vision.