NeurIPS 2022
Local Identifiability of Deep ReLU Neural Networks: the Theory
Joachim Bona-Pellissier, François Malgouyres, François Bachoc
Abstract
Is a sample rich enough to determine, at least locally, the parameters of a neural network? To answer this question, we introduce a new local parameterization of a given deep ReLU neural network by fixing the values of some of its weights. This allows us to define local lifting operators whose inverses are charts of a smooth manifold of a high-dimensional space. The function implemented by the deep ReLU neural network composes the local lifting with a linear operator which depends on the sample. We derive from this convenient representation a geometric necessary and sufficient condition of local identifiability. Looking at tangent spaces, the geometric condition provides: (1) a sharp and testable necessary condition of identifiability and (2) a sharp and testable sufficient condition of local identifiability. The validity of the conditions can be tested numerically using backpropagation and matrix rank computations.

Inverse stability and stable recovery: Closely related to identifiability are the topics of inverse stability and stable recovery of the parameters of a network. Both negative [29] and positive [11, 23, 24, 25] inverse stability results exist. The articles [23, 24, 25] examine the case of structured networks with the identity as activation function. Only [25] considers a finite X. The authors of [11] consider a general class of networks that includes ReLU networks, but their result holds only for one-hidden-layer neural networks. Furthermore, this result requires knowledge of f_θ on a whole domain. Several stable recovery algorithms have also been proposed, first for one-hidden-layer neural networks, with smooth activation functions [14], as well as ReLU in the fully-connected case