ICLR 2026

Random Label Prediction Heads for Studying Memorization in Deep Neural Networks

Marlon Becker, Jonas Konrad, Luis Garcia Rodriguez, Benjamin Risse

ABSTRACT

We introduce a straightforward yet effective method to empirically study memorization in deep neural networks for classification tasks. Our approach augments each training sample with auxiliary random labels, which are then predicted by a random label prediction head (RLP-head). RLP-heads can be attached at arbitrary depths of a network, predicting random labels from the corresponding intermediate representation and thereby enabling analysis of how memorization capacity evolves across layers. By interpreting the RLP-head performance as an empirical estimate of Rademacher complexity, we obtain a direct measure of both sample-level memorization and model capacity. We leverage this random label accuracy metric to analyze generalization and overfitting in different models and datasets. Building on this approach, we further propose a novel regularization technique based on the output of the RLP-head, which demonstrably reduces memorization. Interestingly, our experiments reveal that reducing memorization can either improve or impair generalization, depending on the dataset and training setup. These findings challenge the traditional assumption that overfitting is equivalent to memorization and suggest new hypotheses to reconcile these seemingly contradictory results. The source code is available at https://github.com/MarlonBecker/RandomLabelHeads.

INTRODUCTION

Recent work highlights the striking memorization capacity of state-of-the-art models. For instance, Zhang et al. (2021) demonstrate that modern architectures can perfectly fit datasets with randomly assigned labels, thereby achieving 100% training accuracy in the absence of any learnable structure. In such cases, high accuracy is attainable only through memorization of individual training samples, underscoring that contemporary artificial neural networks (ANNs) can encode sample-specific and task-irrelevant information to fit each training sample individually.
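To make the role of random labels concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It draws fixed auxiliary random labels per training sample and computes a random label accuracy. The function names, the `num_label_sets` parameter, and the toy predictors are assumptions introduced purely for illustration. The sketch assumes the labels are drawn once and keyed by sample index, so that each sample keeps the same random labels across epochs.

```python
import random


def assign_random_labels(num_samples, num_classes, num_label_sets=1, seed=0):
    """Draw fixed auxiliary random labels for every training sample.

    Hypothetical helper: labels are drawn once and indexed by sample,
    so they are independent of the inputs and stay constant over epochs.
    """
    rng = random.Random(seed)
    return [
        [rng.randrange(num_classes) for _ in range(num_label_sets)]
        for _ in range(num_samples)
    ]


def random_label_accuracy(predictions, random_labels):
    """Fraction of (sample, label set) pairs whose random label is predicted correctly."""
    total = 0
    correct = 0
    for pred_row, label_row in zip(predictions, random_labels):
        for pred, label in zip(pred_row, label_row):
            total += 1
            correct += int(pred == label)
    return correct / total


# Toy setup: 1000 samples, 10 classes, 2 auxiliary label sets (all illustrative).
labels = assign_random_labels(num_samples=1000, num_classes=10, num_label_sets=2)

# A pure memorizer: a lookup table from sample index to its random labels.
lookup = {index: row for index, row in enumerate(labels)}
memorized_predictions = [lookup[index] for index in range(len(labels))]

# A predictor that ignores the sample identity and always outputs class 0.
constant_predictions = [[0, 0] for _ in labels]

acc_memorizer = random_label_accuracy(memorized_predictions, labels)
acc_constant = random_label_accuracy(constant_predictions, labels)
```

Because the random labels carry no information about the inputs, only a predictor that stores sample-specific information can score clearly above chance (here 1/10). That is why the accuracy on such labels can serve as a memorization signal.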
This ability to memorize arbitrary labels is directly connected to model complexity. In particular, training with SGD on random labels empirically approximates Rademacher complexity, which plays a central role in deriving generalization bounds within the PAC-learning framework. The primary objective of this work is to assess the accuracy of predicting random labels as a practical metric of memorization. Although direct training on random labels reveals a model's ability to memorize, this procedure does not intrinsically inform how memorization interacts with generalization in real-world tasks and does not allow memorization mitigation. To bridge this gap, we propose a hybrid approach: we augment the network with an additional random label prediction head (RLP-head), attached to the feature extractor (i.e., all layers except the final classification layer) in parallel to the original task head, which remains unchanged. This design enables simultaneous measurement and regularization of memorization during normal training, thereby providing a controlled way to study and modulate memorization in deep neural networks.

In summary, our contributions are as follows:

• We propose the use of random label prediction heads (RLP-heads) as a tool for probing layer-wise memorization in deep neural networks.
• We validate that the random label accuracy derived from RLP-heads is an accurate measure of complexity and memorization.
• We propose a novel regularizer that explicitly constrains memorization by penalizing the performance of the RLP-head during training.
• Building on our metric and regularizer, we show how memorization can hinder or, in certain scenarios, facilitate generalization. We further hypothesize that this dual role is driven by sampling effects in the training data.
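The link between random label accuracy and Rademacher complexity can be made explicit in the simplest case. The block below is a sketch of the standard textbook definition for binary hypotheses, not a formula taken from this paper. The symbols $S$, $\mathcal{H}$, $\sigma$, and $\mathrm{acc}_\sigma$ are notation assumed here for illustration.

```latex
% Empirical Rademacher complexity of a hypothesis class H on a sample
% S = (x_1, ..., x_n), with sigma_i drawn i.i.d. uniformly from {-1, +1}:
\[
  \hat{\mathfrak{R}}_S(\mathcal{H})
  = \mathbb{E}_{\sigma}\!\left[ \sup_{h \in \mathcal{H}}
    \frac{1}{n} \sum_{i=1}^{n} \sigma_i \, h(x_i) \right].
\]
% For binary predictors h(x_i) in {-1, +1}, each term sigma_i h(x_i) is +1
% when h predicts the random label correctly and -1 otherwise, so
\[
  \frac{1}{n} \sum_{i=1}^{n} \sigma_i \, h(x_i)
  = \mathrm{acc}_\sigma(h) - \bigl(1 - \mathrm{acc}_\sigma(h)\bigr)
  = 2\,\mathrm{acc}_\sigma(h) - 1,
\]
% where acc_sigma(h) is the fraction of samples with h(x_i) = sigma_i. Hence
\[
  \hat{\mathfrak{R}}_S(\mathcal{H})
  = 2\, \mathbb{E}_{\sigma}\!\left[ \sup_{h \in \mathcal{H}}
    \mathrm{acc}_\sigma(h) \right] - 1 .
\]
```

Under this reading, chance-level random label accuracy (1/2) corresponds to a complexity of 0 and perfect accuracy to 1. Training with SGD only approximates the supremum, so the accuracy reached in practice yields a lower estimate of this quantity rather than its exact value.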
RELATED WORK

The phenomenon of data memorization, although not new, gained renewed attention in the era of modern deep learning with the works of Zhang et al. (2021) and Arpit et al. (2017). Traditionally, memorization was associated with model capacity and overfitting, and hence viewed primarily as a source of poor generalization. The view that capacity is responsible for overfitting has been challenged by the discovery of the double descent phenomenon (Nakkiran et al., 2021), which reveals a more nuanced relationship between capacity and generalization. Feldman (2019) formalizes memorization as the ability of a model to correctly predict a label only if the sample was present in the training data. This analysis suggests that the key obstacle to generalization is not label noise but suboptimal sampling, with many regions of the data distribution undersampled or represented by only a single example. We compare our proposed memorization metric in detail to the work of Feldman & Zhang (2020) in Appendix A.12. Even though these atypical examples in so-called long-tailed data distributions are memorized individually to reach high training performance, this memorization leads to improved generalization