WWW2026

The Power of Penalties: Negativity-Aware Incentives for High-Quality Crowdsourced Data Labeling

Kai Wang, Runze Wu, Yu Xiong, Haifeng Sun, Anran Li, Shaojie Tang, Changjie Fan, Xiang-Yang Li

Abstract

High-quality data labeling is essential for training robust machine learning models. However, existing incentive methods often ignore fraud or assume non-negative worker utility, so they cannot penalize harmful contributions without discouraging participation. To address this, we propose the Negativity-Aware Incentive (NAI) mechanism, which introduces two novel components. First, the Ability-Result Characteristic Function (AR-CF) adapts and extends Shapley value theory through signed valuation: it combines workers' abilities with real-time task results to define contribution values, explicitly capturing both positive and negative contributions. Second, a dynamic stake pool mechanism employs pre-commitment economics with adaptive dual-control parameters to balance fairness and operational efficiency. In extensive experiments on multimodal datasets (images, text, audio, video), NAI outperforms state-of-the-art baselines: it improves video labeling accuracy by 16.6% and reduces fraudulent behavior by 33.9%. Furthermore, our deployment on the NetEase Youling crowdsourcing platform, which serves 430,000 registered workers and 80,000 monthly active workers, validates NAI's real-world viability. Real-time A/B testing shows a 59.6% improvement in labeling quality for beginner tasks and a consistent reduction in fraud rates (14.8%-33.9%) across difficulty levels. This work establishes a paradigm shift in crowdsourcing system design, demonstrating that explicit modeling of negative contributions can enhance data quality, optimize costs, and foster participation at scale.