ICLR 2025

The impact of allocation strategies in subset learning on the expressive power of neural networks

Ofir Schlisselberg, Ran Darshan

Abstract

We propose a new approach to the problem of neural network expressivity, which seeks to characterize how the structural properties of a neural network family affect the functions it is able to compute. Our approach is based on an interrelated set of measures of expressivity, unified by the novel notion of trajectory length, which measures how the output of a network changes as the input sweeps along a one-dimensional path. Our findings can be summarized as follows: (1) The complexity of the computed function grows exponentially with depth. We design measures of expressivity that capture the non-linearity of the computed function; because of how the network transforms its input, these measures grow exponentially with depth. (2) All weights are not equal (initial layers matter more). We find that trained networks are far more sensitive to their lower (initial) layer weights: they are much less robust to noise in these weights, and they perform better when these weights are optimized well. (3) Trajectory regularization works like batch normalization. We find that batch normalization stabilizes the learned representation, and based on this we propose a new regularization scheme, trajectory regularization.
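The notion of trajectory length described above can be illustrated with a minimal sketch. The code below is not the authors' implementation: the circular input path, the fully connected tanh network, the Gaussian initialization, and the names `trajectory_length`, `layerwise_trajectory_lengths`, `width`, and `sigma_w` are all assumptions chosen for illustration. It approximates the length of a one-dimensional path by summing the distances between consecutive sample points, then records that length after each layer of a randomly initialized network.

```python
import numpy as np


def trajectory_length(points):
    """Approximate arc length: sum of Euclidean distances between consecutive rows."""
    steps = np.diff(points, axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())


def layerwise_trajectory_lengths(depth=6, width=100, sigma_w=4.0,
                                 n_points=1000, seed=0):
    """Length of a circular input path after each layer of a random tanh network.

    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)

    # One-dimensional input path: a unit circle in the first two input coordinates.
    t = np.linspace(0.0, 2.0 * np.pi, n_points)
    x = np.zeros((n_points, width))
    x[:, 0] = np.cos(t)
    x[:, 1] = np.sin(t)

    lengths = [trajectory_length(x)]  # length of the input path itself
    h = x
    for _ in range(depth):
        # Random Gaussian weights with variance sigma_w**2 / width (an assumed scaling).
        w = rng.normal(0.0, sigma_w / np.sqrt(width), size=(width, width))
        h = np.tanh(h @ w)
        lengths.append(trajectory_length(h))
    return lengths


if __name__ == "__main__":
    for layer, length in enumerate(layerwise_trajectory_lengths()):
        print(f"layer {layer}: trajectory length = {length:.2f}")
```

Running the script prints one length per layer, starting with the input path, so the change in length with depth can be inspected directly for different choices of `sigma_w` and `width`.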