WWW 2026

ProvGuard: Logic-Aware Multi-View Contrastive Learning for Robust and Efficient Host Threat Detection

Anyuan Sang, Li Yang, Lu Zhou, Cheng Zhou, Junbo Jia, Huipeng Yang

Abstract

The security of web services increasingly relies on accurately detecting advanced, previously unseen attacks hidden within complex host activities. Provenance-based intrusion detection systems (PIDSes) offer a promising foundation for this task by capturing rich causal and structural relationships across processes, files, and network interactions. However, recent studies show that these graph-driven methods remain vulnerable to graph manipulation attacks, in which adversaries subtly alter provenance graphs to evade detection, and this vulnerability limits their practical deployment. To address this challenge, we present ProvGuard, a robust anomaly detection framework that couples logic-aware multi-view augmentation with contrastive representation learning. Instead of applying arbitrary structural perturbations, ProvGuard employs Logic-Aware Noise Injection (LNI) to generate semantically valid graph views that preserve the causal semantics of provenance data. These views then feed a Logic-Preserving Contrastive Learning module, which enables the model to learn representations that are invariant to benign transformations yet sensitive to adversarial inconsistencies. Extensive evaluations on multiple provenance datasets show that ProvGuard resists graph manipulation attacks better than state-of-the-art detectors while maintaining high detection accuracy and efficiency, achieving an average F1-score above 96% with an AUC drop of less than 10% under attack.
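To make the two ideas in the abstract concrete, the following is a minimal illustrative sketch, not ProvGuard's actual implementation. It assumes that "semantically valid" means every injected edge obeys a fixed schema of allowed (source type, relation, target type) triples, and it uses the widely known InfoNCE objective as a stand-in for the contrastive loss, which the abstract does not specify. The names `VALID_TRIPLES`, `inject_valid_noise`, and `info_nce`, as well as the schema entries and the temperature value, are hypothetical.

```python
import math
import random

# Hypothetical schema of causally valid (source type, relation, target type)
# triples. The rules ProvGuard actually uses are not given in the abstract.
VALID_TRIPLES = {
    ("process", "fork", "process"),
    ("process", "read", "file"),
    ("process", "write", "file"),
    ("process", "connect", "socket"),
}


def inject_valid_noise(node_types, edges, num_new, seed=0):
    """Return the original edges plus up to num_new random edges,
    each of which satisfies VALID_TRIPLES (an assumed notion of validity)."""
    rng = random.Random(seed)
    nodes = sorted(node_types)
    triples = sorted(VALID_TRIPLES)
    existing = set(edges)
    view = list(edges)
    added, attempts = 0, 0
    # Bound the number of attempts so the loop always terminates.
    while added < num_new and attempts < 100 * max(num_new, 1):
        attempts += 1
        src_type, rel, dst_type = rng.choice(triples)
        srcs = [n for n in nodes if node_types[n] == src_type]
        dsts = [n for n in nodes if node_types[n] == dst_type]
        if not srcs or not dsts:
            continue
        edge = (rng.choice(srcs), rel, rng.choice(dsts))
        if edge[0] == edge[2] or edge in existing:
            continue  # skip self-loops and duplicates
        existing.add(edge)
        view.append(edge)
        added += 1
    return view


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss, where view_a[i] and view_b[i] embed the same node
    and every other row of view_b acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        log_denom = math.log(sum(math.exp(x) for x in logits))
        total += log_denom - logits[i]
    return total / len(view_a)


# Toy provenance graph: node name -> entity type, plus typed edges.
node_types = {
    "bash": "process",
    "curl": "process",
    "/etc/passwd": "file",
    "10.0.0.5:443": "socket",
}
edges = [("bash", "fork", "curl"), ("curl", "connect", "10.0.0.5:443")]
view = inject_valid_noise(node_types, edges, num_new=2, seed=1)
```

In this sketch the augmented view keeps every original edge and only adds edges that the schema permits, so a file can never "fork" a process, for example. The loss is low when the two views embed each node consistently and high when they disagree, which is the behavior a contrastive module of this kind relies on.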