CCS 2025

PoisonSpot: Precise Spotting of Clean-Label Backdoors via Fine-Grained Training Provenance Tracking

Philemon Hailemariam, Birhanu Eshete

Abstract

Relying on untrusted data exposes machine learning models to backdoor attacks, where adversaries poison training data to embed hidden behaviors. Existing defenses struggle against increasingly stealthy attacks, particularly clean-label backdoor attacks, because they cannot monitor the fine-grained impact of individual training samples on model updates. In this paper, we present PoisonSpot, a novel system that precisely detects clean-label backdoor attacks by using fine-grained training provenance tracking, inspired by dynamic taint tracking. PoisonSpot captures and analyzes the impact of individual training samples on model parameter updates throughout the training process. By attributing poisoning scores to suspect samples based on their impact lineage, PoisonSpot allows for accurate identification and rejection of samples carrying backdoor triggers. We evaluate PoisonSpot on multiple benchmark datasets and attack scenarios, demonstrating its superior performance compared to the state-of-the-art clean-label backdoor poisoning defense. PoisonSpot consistently achieves high true positive rates and low false positive rates, and effectively mitigates backdoor attacks, even under adaptive adversarial strategies. Furthermore, PoisonSpot operates efficiently in various training settings, including retraining and fine-tuning regimes, demonstrating its robustness and scalability.
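
To make the idea of tracking each sample's impact on parameter updates concrete, the following is a minimal toy sketch, not PoisonSpot's actual algorithm. It assumes a plain logistic-regression model trained with mini-batch SGD, and it scores each sample by the accumulated dot product between that sample's own share of a batch update and the full update applied to the parameters. The function names (`per_sample_grad`, `train_with_provenance`) and the scoring rule are hypothetical choices made for illustration only.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def per_sample_grad(w, x, y):
    """Gradient of the logistic loss for one sample (x: list of floats, y: 0 or 1)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return [(p - y) * xi for xi in x]


def train_with_provenance(data, lr=0.1, epochs=5, batch_size=4, seed=0):
    """Mini-batch SGD that records how strongly each sample drove the updates.

    Returns (weights, scores). scores[i] accumulates the dot product between
    sample i's share of a batch update and the full batch update.
    This scoring rule is an illustrative assumption, not the paper's method.
    """
    rng = random.Random(seed)
    dim = len(data[0][0])
    w = [0.0] * dim
    scores = [0.0] * len(data)
    order = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = order[start:start + batch_size]
            grads = {i: per_sample_grad(w, data[i][0], data[i][1]) for i in batch}
            # Full parameter update applied for this batch.
            update = [
                -lr * sum(grads[i][d] for i in batch) / len(batch)
                for d in range(dim)
            ]
            # Attribute the update back to the samples that produced it.
            for i in batch:
                share = [-lr * g / len(batch) for g in grads[i]]
                scores[i] += sum(s * u for s, u in zip(share, update))
            w = [wi + ui for wi, ui in zip(w, update)]
    return w, scores
```

Calling `train_with_provenance(data)` on a list of `(features, label)` pairs returns the trained weights together with one score per sample. In this sketch, a sample whose gradient consistently aligns with the applied updates accumulates a larger score. A real detector would need a far richer lineage signal than this single scalar.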