ICLR2025

Global Identifiability of Overcomplete Dictionary Learning via L1 and Volume Minimization

Yuchen Sun, Kejun Huang

Abstract

We study the theoretical properties of learning a dictionary from N signals x i ∈ R K for i = 1, . . . , N via 1 -minimization. We assume that x i 's are i.i.d. random linear combinations of the K columns from a complete (i.e., square and invertible) reference dictionary D 0 ∈ R K×K . Here, the random linear coefficients are generated from either the s-sparse Gaussian model or the Bernoulli-Gaussian model. First, for the population case, we establish a sufficient and almost necessary condition for the reference dictionary D 0 to be locally identifiable, i.e., a strict local minimum of the expected 1 -norm objective function. Our condition covers both sparse and dense cases of the random linear coefficients and significantly improves the sufficient condition by Gribonval and Schnass (2010) . In addition, we show that for a complete µ-coherent reference dictionary, i.e., a dictionary with absolute pairwise column inner-product at most µ ∈ [0, 1), local identifiability holds even when the random linear coefficient vector has up to O(µ -2 ) nonzero entries. Moreover, our local identifiability results also translate to the finite sample case with high probability provided that the number of signals N scales as O(K log K).