ICLR2023
Lower Bounds on the Depth of Integral ReLU Neural Networks via Lattice Polytopes
Christian Haase, Christoph Hertrich, Georg Loho
3 citations
Abstract
We prove that the set of functions representable by ReLU neural networks with integer weights strictly increases with the network depth while allowing arbitrary width. More precisely, we show that ⌈log 2 (n)⌉ hidden layers are indeed necessary to compute the maximum of n numbers, matching known upper bounds. Our results are based on the known duality between neural networks and Newton polytopes via tropical geometry. The integrality assumption implies that these Newton polytopes are lattice polytopes. Then, our depth lower bounds follow from a parity argument on the normalized volume of faces of such polytopes. Published as a conference paper at ICLR 2023 Hertrich et al. ( 2021 ) conjecture that the former alternative is true. More precisely, if ReLU n (k) denotes the set of CPWL functions defined on R n and computable with k hidden layers, the conjecture can be formulated as follows: Conjecture 1 (Hertrich et al. ( 2021 )). ReLU n (k -1) ReLU n (k) for all k ≤ ⌈log 2 (n + 1)⌉. Note that ReLU n (⌈log 2 (n + 1)⌉) is the entire set of CPWL functions defined on R n by the result of Arora et al. (2018) . While Hertrich et al. (2021) provide some evidence for their conjecture, it remains open for every input dimension n ≥ 4. Even more drastically, there is not a single CPWL function known for which one can prove that two hidden layers are not sufficient to represent it. Even for a function as simple as max0, x 1 , x 2 , x 3 , x 4 , it is unknown whether two hidden layers are sufficient. In fact, max0, x 1 , x 2 , x 3 , x 4 is not just an arbitrary example. Based on a result by Wang & Sun (2005 ), Hertrich et al. (2021) show that their conjecture is equivalent to the following statement. Conjecture 2 (Hertrich et al. ( 2021 )). For n = 2 k , the function max0, x 1 , . . . , x n is not contained in ReLU n (k).