ICLR2026

Bilevel Optimization with Lower-Level Uniform Convexity: Theory and Algorithm

Yuman Wu, Xiaochuan Gong, Jie Hao, Mingrui Liu

被引用 1 次

摘要

Bilevel optimization is a hierarchical framework where an upper-level optimization problem is constrained by a lower-level problem, commonly used in machine learning applications such as hyperparameter optimization. Existing bilevel optimization methods typically assume strong convexity or Polyak-Łojasiewicz (PL) conditions for the lower-level function to establish non-asymptotic convergence to a solution with small hypergradient. However, these assumptions may not hold in practice, and recent work (Chen et al., 2024) has shown that bilevel optimization is inherently intractable for general convex lower-level functions with the goal of finding small hypergradients. In this paper, we identify a tractable class of bilevel optimization problems that interpolates between lower-level strong convexity and general convexity via lowerlevel uniform convexity. For uniformly convex lower-level functions with exponent p ≥ 2, we establish a novel implicit differentiation theorem characterizing the hyperobjective's smoothness property. Building on this, we design a new stochastic algorithm, termed UniBiO, with provable convergence guarantees, based on an oracle that provides stochastic gradient and Hessian-vector product information for the bilevel problems. Our algorithm achieves O(ϵ -5p+6 ) oracle complexity bound for finding ϵ-stationary points. Notably, our complexity bounds match the optimal rates in terms of the ϵ dependency for strongly convex lower-level functions (p = 2), up to logarithmic factors. Our theoretical findings are validated through experiments on synthetic tasks and data hyper-cleaning, demonstrating the effectiveness of our proposed algorithm. where f and g are referred to as upper-level and lower-level functions respectively. A common assumption in bilevel optimization is that the lower-level function is either strongly convex (Ghadimi