ACL 2024

Data Contamination Calibration for Black-box LLMs

Wentao Ye, Jiaqi Hu, Liyao Li, Haobo Wang, Gang Chen, Junbo Zhao

Abstract

The rapid advancement of Large Language Models (LLMs) is closely tied to the expansion of training data size. However, unchecked ultra-large-scale training sets introduce a series of potential risks such as data contamination, i.e., benchmark data being used for training. In this work, we propose a holistic method named Polarized Augment Calibration (PAC), along with a brand-new dataset named StackMIA, to help detect contaminated data and diminish the contamination effect. PAC extends the popular Membership Inference Attack (MIA) from the machine learning community by forming a more global target for detecting training data, so as to clarify invisible training data. As a pioneering work, PAC is plug-and-play and can be integrated with most (if not all) current white-box and, for the first time, black-box LLMs. In extensive experiments, PAC outperforms existing methods by at least 4.5% in data contamination detection across more than 4 dataset formats and more than 10 base LLMs. In addition, our application in real-world scenarios highlights the prominent presence of contamination and related issues.