NeurIPS2020

An efficient nonconvex reformulation of stagewise convex optimization problems

Rudy Bunel, Oliver Hinder, Srinadh Bhojanapalli, Krishnamurthy Dvijotham

被引用 17 次

摘要

Convex optimization problems with staged structure appear in several contexts, including optimal control, verification of deep neural networks, and isotonic regression. Off-the-shelf solvers can solve these problems but may scale poorly. We develop a nonconvex reformulation designed to exploit this staged structure. Our reformulation has only simple bound constraints, enabling solution via projected gradient methods and their accelerated variants. The method automatically generates a sequence of primal and dual feasible solutions to the original convex problem, making optimality certification easy. We establish theoretical properties of the nonconvex formulation, showing that it is (almost) free of spurious local minima and has the same global optimum as the convex problem. We modify PGD to avoid spurious local minimizers so it always converges to the global minimizer. For neural network verification, our approach obtains small duality gaps in only a few gradient steps. Consequently, it can quickly solve large-scale verification problems faster than both off-the-shelf and specialized solvers. Introduction This paper studies efficient algorithms for a particular class of stage-wise optimization problems: minimize where n and m are positive integers, S ⊆ R m , the function f has domain S × R n and range R, the functions µ i and η i have domain S × R i-1 and range R. Given a vector z, we use the notation z 1:i to denote the vector [z 1 , . . . , z i ]. We let z 1:0 be a vector of length zero. Throughout the paper we assume that η 1 , . . . , η n are proper concave functions, f , µ 1 , . . . , µ n are proper convex functions, and S is a nonempty convex set. Problems that fall into this problem class are ubiquitous. They appear in optimal control [1], finite horizon Markov decision processes with cost function controlled by an adversary [2], generalized Isotonic regression [3, 4], and verification of neural networks [5-7]. Details explaining how these problems can be written in the form of (1) are given in Appendix A. Here we briefly outline how neural network verification falls into (1b). Letting s represent the input image and z the activation values, neural networks verification can be written (unconventionally) as minimize (s,z)∈S×R n f (s, z) s.t. z i = σ([s, z 1:i-1 ] • w i ),