ICLR 2025
ADMM for Nonconvex Optimization under Minimal Continuity Assumption
Ganzhao Yuan
Abstract
This paper introduces a novel approach to solving multi-block nonconvex composite optimization problems through a proximal linearized Alternating Direction Method of Multipliers (ADMM). The method incorporates an Increasing Penalization and Decreasing Smoothing (IPDS) strategy. Unlike existing ADMM-style algorithms, our approach (denoted IPDS-ADMM) imposes a less stringent condition: it requires continuity in only one block of the objective function. IPDS-ADMM requires that the penalty parameter increases and the smoothing parameter decreases, both at a controlled pace. When the associated linear operator is bijective, IPDS-ADMM uses an over-relaxation stepsize for faster convergence; when the linear operator is only surjective, it uses an under-relaxation stepsize for global convergence. We devise a novel potential function to facilitate our convergence analysis and prove an oracle complexity of $\mathcal{O}(\epsilon^{-3})$ to achieve an $\epsilon$-approximate critical point. To the best of our knowledge, this is the first complexity result for using ADMM to solve this class of nonsmooth nonconvex problems. Finally, experiments on the sparse PCA problem demonstrate the effectiveness of our approach.

Note a: $h_n = 0$ denotes that the $n$-th block has no nonsmooth part, making the objective function smooth.
Note b: The iteration complexity relies on the variational inequality of the convex problem.
Note c: We adapt their application model into our optimization framework in Equation (1) with $(L, S, Z) = (x_1, x_2, x_3)$, as their model additionally requires the linear operator for the first two blocks to be injective.
Note d: This paper studies manifold optimization with a fixed large penalty and small stepsize.

► Assumptions. Throughout this paper, we impose the following assumptions on Problem (1).

Lemma 1.2.3 in (Nesterov, 2003)).

Assumption 1.2.
The functions $f_n(\cdot)$ and $h_n(\cdot)$ are Lipschitz continuous with some constants $C_f$ and $C_h$, satisfying $\|\nabla f_n(x_n)\| \le C_f$ and $\|\partial h_n(x_n)\| \le C_h$ for all $x_n$.
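To make the bound on $\|\partial h_n(x_n)\|$ concrete, the following minimal sketch checks it numerically for one illustrative choice, $h(x) = \|x\|_1$ on $\mathbb{R}^d$, for which every subgradient lies in $[-1, 1]^d$ and hence $C_h = \sqrt{d}$ suffices. The choice of the $\ell_1$ norm, the dimension, and the helper names (`l1_norm`, `l1_subgradient`, `check_lipschitz_bounds`) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def l1_norm(x):
    """Illustrative nonsmooth function h(x) = ||x||_1 (an assumed example)."""
    return np.abs(x).sum()


def l1_subgradient(x):
    """One valid subgradient of the l1 norm: sign(x), with 0 at zero entries."""
    return np.sign(x)


def check_lipschitz_bounds(d=20, trials=1000, seed=0):
    """Empirically compare subgradient norms and difference quotients of h
    against the candidate constant C_h = sqrt(d)."""
    rng = np.random.default_rng(seed)
    c_h = np.sqrt(d)
    max_subgrad_norm = 0.0
    max_ratio = 0.0
    for _ in range(trials):
        x = rng.standard_normal(d)
        y = rng.standard_normal(d)
        # Euclidean norm of a subgradient at x.
        max_subgrad_norm = max(max_subgrad_norm,
                               np.linalg.norm(l1_subgradient(x)))
        # Difference quotient |h(x) - h(y)| / ||x - y||.
        ratio = abs(l1_norm(x) - l1_norm(y)) / np.linalg.norm(x - y)
        max_ratio = max(max_ratio, ratio)
    return c_h, max_subgrad_norm, max_ratio


if __name__ == "__main__":
    c_h, g, r = check_lipschitz_bounds()
    print(f"C_h = {c_h:.4f}, max subgradient norm = {g:.4f}, "
          f"max difference quotient = {r:.4f}")
```

Both reported maxima should stay at or below $\sqrt{d}$. A random check of this kind can only fail to find a violation; it does not prove the bound, which for the $\ell_1$ norm follows directly from the subgradient characterization.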