ICML 2025
SAND: One-Shot Feature Selection with Additive Noise Distortion
Pedram Pad, Hadi Hammoud, Mohamad Dia, Nadim Maamari, Liza Andrea Dunbar
Abstract
Feature selection is a critical step in data-driven applications, reducing input dimensionality to enhance learning accuracy, computational efficiency, and interpretability. Existing state-of-the-art methods often require post-selection retraining and extensive hyperparameter tuning, complicating their adoption. We introduce a novel, non-intrusive feature selection layer that, given a target feature count $k$, automatically identifies and selects the $k$ most informative features during neural network training. Our method is uniquely simple, requiring no alterations to the loss function or network architecture and no post-selection retraining. The layer is mathematically elegant and can be fully described by $\tilde{x}_i = a_i x_i + (1 - a_i) z_i$, where $x_i$ is the input feature, $\tilde{x}_i$ the output, $z_i$ Gaussian noise, and $a_i$ a trainable gain such that $\sum_i a_i^2 = k$. This formulation induces an automatic clustering effect, driving $k$ of the gains $a_i$ to 1 (selecting informative features) and the rest to 0 (discarding redundant ones) via weighted noise distortion and gain normalization. Despite its extreme simplicity, our method achieves competitive performance on standard benchmark datasets and a novel real-world dataset, often matching or exceeding existing approaches without requiring hyperparameter search for $k$ or retraining. Theoretical analysis in the context of linear regression further validates its efficacy. Our work demonstrates that simplicity and performance are not mutually exclusive, offering a powerful yet straightforward tool for feature selection in machine learning.
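As an illustration of the layer equation and the gain constraint, the forward pass for a single sample might be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `normalize_gains` and `sand_forward` are hypothetical, the noise is assumed to be standard normal, and the constraint is assumed to be enforced by simply rescaling the gain vector so that its squared entries sum to k. In an actual network the gains would be trained by backpropagation, which is omitted here.

```python
import math
import random


def normalize_gains(raw_gains, k):
    """Rescale gains so that the sum of their squares equals k.

    Assumption: plain rescaling of the absolute values; the paper's exact
    normalization scheme is not given in the abstract.
    """
    norm = math.sqrt(sum(g * g for g in raw_gains))
    if norm == 0.0:
        raise ValueError("gains must not all be zero")
    scale = math.sqrt(k) / norm
    return [abs(g) * scale for g in raw_gains]


def sand_forward(x, gains, rng):
    """Compute x_tilde_i = a_i * x_i + (1 - a_i) * z_i for one sample.

    Assumption: z_i is drawn from a standard normal distribution.
    """
    return [a * xi + (1.0 - a) * rng.gauss(0.0, 1.0) for a, xi in zip(gains, x)]


# Hypothetical example: 4 features, k = 2.
rng = random.Random(0)
initial = normalize_gains([0.5, 0.5, 0.5, 0.5], k=2)  # every gain strictly between 0 and 1
converged = normalize_gains([1.0, 1.0, 0.0, 0.0], k=2)  # gains already at 0 or 1
sample = [3.0, -1.5, 7.0, 0.2]
out = sand_forward(sample, converged, rng)
```

With the converged gains, the first two features pass through unchanged (their noise weight is zero), while the last two are replaced entirely by noise and carry no information about the input, which is the selection behavior the abstract describes.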