NDSS 2026

Unshaken by Weak Embedding: Robust Probabilistic Watermarking for Dataset Copyright Protection

Shang Wang, Tianqing Zhu, Dayong Ye, Hua Ma, Bo Liu, Ming Ding, Shengfang Zhai, Yansong Gao

Abstract

analytics and AI, where data is a critical determinant of model performance, particularly in the training of large language models (LLMs) [3], [4]. However, acquiring high-quality data is non-trivial: collecting and annotating it requires significant effort. Because acquiring certain datasets involves domain expertise and data regulations, it is practical for model providers to purchase the data they need from professional data curators, such as brokerage companies like Appen [5] and Scale AI [6], rather than from individual contributors. For example, clickworkers acting as data contributors can simply download the Clickworker app [7], make a contribution, and earn money from it. In this Data-as-a-Service (DaaS) business scenario, data contributors are informed by the data curator about data usage and are compensated per order requested by model providers, as illustrated in Figure 1. Unfortunately, as the central entity in the DaaS scenario, the data curator may exploit legitimate business processes to maximize its financial gains. Specifically, while continuing to charge the model provider for data usage, the data curator may withhold payments from data contributors and not inform them of such transactions. Such misconduct not only compromises the interests of data contributors but also amplifies the risk of data misuse. Therefore, contributors must safeguard their copyrights to prevent unauthorized use by the curator.

State-of-the-Art. Unlike model copyright protection [8]-[12], which has been extensively studied, dataset copyright protection must rely on black-box access to the suspicious model, with no control over its training, and only a few works have explored dataset ownership verification (DOV). These methods seek to determine whether a suspicious model was trained on a given dataset, using either intrusive or non-intrusive approaches [13]. Non-intrusive DOV methods typically extract unique characteristics from the contributed datasets as fingerprints.
Examples include DeepTaster [14] and dataset-level membership inference [15]-[17]. However, they require access to model architectures or meticulously crafted auxiliary datasets, which remain key limitations in DaaS scenarios. Intrusive DOV, in contrast, leverages watermarking methods. They embed identifiable signals into

Abstract: In modern Data-as-a-Service (DaaS) ecosystems, data curators such as data brokerage companies aggregate high-quality data from many contributors and monetize it for deep learning model providers. However, malicious curators can sell valuable data without informing the original contributors, which harms the contributors' interests and violates the law. Intrusive watermarking is one of the state-of-the-art (SOTA) techniques for protecting data copyright; it detects whether a suspicious model carries a predefined pattern. However, existing approaches face several limitations: they struggle under low watermark injection rates (≤ 1.0%), degrade model performance, produce false positives, and are not robust against watermark cleansing. This work proposes an intrusive watermarking approach, dubbed DIP (Data Intelligence Probabilistic Watermarking), that supports dataset ownership verification while addressing the limitations above. It applies a distribution-aware sample selection algorithm, embeds probabilistic associations between watermarked samples and multiple outputs, and adopts a two-fold verification framework that leverages both inference results and their distribution as watermark signals. Extensive experiments on 4 image and 5 text datasets demonstrate that DIP maintains the model's performance and achieves an average watermark success rate of 89.4% at a 1% injection budget. We further validate that DIP is orthogonal to various watermarked data designs and can seamlessly integrate their strengths.
Moreover, DIP proves effective across diverse modalities (image and text) and tasks (regression), with strong performance on generation tasks in large language models. DIP is robust in a range of adversarial settings, including 3 attacks based on data augmentation, 3 on data cleansing, 4 on robust training, and 3 on collusion-based watermark removal, whereas existing SOTA methods fail. The source code is released at https://github.com/SixLab6/DIP.
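To make the idea of "probabilistic associations between watermarked samples and multiple outputs" and a two-fold check on "inference results and their distribution" more concrete, the following toy sketch may help. It is not DIP's algorithm, which the abstract does not specify. The target classes, their probabilities, the thresholds, the use of total variation distance, and every function name here are hypothetical choices made only for illustration.

```python
import random
from collections import Counter

# Hypothetical target distribution (an assumption, not taken from the paper):
# watermarked samples are labeled class 2 with probability 0.7 and class 5
# with probability 0.3.
TARGET = {2: 0.7, 5: 0.3}


def injection_budget(dataset_size, rate=0.01):
    """Number of samples to watermark at a given injection rate (at least 1)."""
    return max(1, round(dataset_size * rate))


def assign_watermark_labels(num_samples, target=TARGET, seed=0):
    """Draw one label per watermarked sample from the target distribution."""
    rng = random.Random(seed)
    classes = list(target)
    weights = [target[c] for c in classes]
    return rng.choices(classes, weights=weights, k=num_samples)


def empirical_distribution(labels, classes):
    """Relative frequency of each class in `classes` among `labels`."""
    counts = Counter(labels)
    return {c: counts.get(c, 0) / len(labels) for c in classes}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def verify(predictions, target=TARGET, hit_threshold=0.8, tv_threshold=0.15):
    """Toy two-fold check on a model's predictions for watermarked inputs.

    Fold 1: enough predictions fall inside the target class set.
    Fold 2: among those, the label distribution is close to the target.
    Returns (verified, hit_rate, tv_distance).
    """
    if not predictions:
        raise ValueError("predictions must be non-empty")
    hits = [y for y in predictions if y in target]
    hit_rate = len(hits) / len(predictions)
    if not hits:
        return False, hit_rate, 1.0
    tv = total_variation(empirical_distribution(hits, target), target)
    return hit_rate >= hit_threshold and tv <= tv_threshold, hit_rate, tv
```

Under these assumptions, `injection_budget(50000)` gives the number of samples that a 1% budget allows, and `verify` rejects both a model that never outputs the target classes and one that maps every watermarked input to a single class, since the second fold compares the observed label distribution against the target.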