ICLR 2026
Deep-ICE: The first globally optimal algorithm for empirical risk minimization of two-layer maxout and ReLU networks
Xi He, Yi Miao, Max A Little
Cited by 2
Abstract
This paper introduces the first globally optimal algorithm for the empirical risk minimization problem of two-layer maxout and ReLU networks, i.e., minimizing the number of misclassifications. The algorithm has a worst-case time complexity of $O(N^{DK+1})$, where $N$ is the number of data points, $K$ denotes the number of hidden neurons, and $D$ represents the number of features. It can be generalized to accommodate arbitrary computable loss functions without affecting its computational complexity. Our experiments demonstrate that the proposed algorithm provides provably exact solutions for small-scale datasets. To handle larger datasets, we introduce a heuristic method that reduces the data size to a scale our algorithm can handle. This extension enables efficient processing of large-scale datasets and achieves significantly improved performance in both training and prediction compared to state-of-the-art approaches (neural networks trained using gradient descent and support vector machines) when applied to the same models (two-layer networks with fixed hidden nodes and linear models). The artifacts of the Deep-ICE algorithm can be found at https://github.com/XiHegrt/DeepICE-algorithm-artifacts.

* Designed the core algorithms, provided theoretical proofs, conducted the main experiments, and wrote the manuscript.

† Implemented the CUDA version of the Deep-ICE algorithm, and co-investigated the ordered generation and memory-free techniques.

‡ Initiated the project and provided supervision and critical feedback throughout the research and writing process.
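To make the objective in the abstract concrete, the following minimal sketch evaluates the number of misclassifications (the 0-1 loss) of a two-layer ReLU network on a toy dataset. It only evaluates the objective; it is not the Deep-ICE algorithm. The parameterization (hidden weights `W`, biases `b`, output weights `v`, offset `c`), the function name, the label convention in {-1, +1}, and the tie-breaking rule are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def zero_one_loss_two_layer_relu(X, y, W, b, v, c=0.0):
    """Count misclassifications of sign(v . relu(W x + b) + c).

    Hypothetical parameterization for illustration only:
    X: (N, D) data, y: (N,) labels in {-1, +1},
    W: (K, D) hidden weights, b: (K,) hidden biases,
    v: (K,) output weights, c: scalar output offset.
    """
    hidden = np.maximum(X @ W.T + b, 0.0)   # (N, K) ReLU activations
    scores = hidden @ v + c                 # (N,) network outputs
    preds = np.where(scores >= 0, 1, -1)    # ties at zero go to +1 (arbitrary choice)
    return int(np.sum(preds != y))

# Toy data: N = 4 points, D = 2 features, K = 2 hidden neurons.
X = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
y = np.array([1, 1, -1, -1])
W = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.zeros(2)
v = np.array([1.0, 1.0])

loss_with_offset = zero_one_loss_two_layer_relu(X, y, W, b, v, c=-0.5)
loss_without_offset = zero_one_loss_two_layer_relu(X, y, W, b, v, c=0.0)
print(loss_with_offset, loss_without_offset)
```

Because this loss is a piecewise-constant count, it provides no useful gradient signal, which is why gradient-based training optimizes a surrogate loss instead, whereas an exact method searches over the finitely many distinct ways the network can partition the data.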