ICSE 2025
Weakly-Supervised Log-Based Anomaly Detection with Inexact Labels via Multi-Instance Learning
Minghua He, Tong Jia, Chiming Duan, Huaqian Cai, Ying Li, Gang Huang
Cited by 9
Abstract
Log-based anomaly detection is essential for maintaining software availability. However, existing log-based anomaly detection approaches rely heavily on fine-grained, exact labels for individual log entries, which are very hard to obtain in real-world systems. This creates a key problem: anomaly detection models require supervision signals, yet labeled log entries are unavailable. To address this problem, we propose a new labeling strategy called inexact labeling, in which system experts label a bag of log entries within a time span instead of labeling each log entry. Furthermore, we propose MIDLog, a weakly supervised log-based anomaly detection approach with inexact labels. We leverage the multi-instance learning paradigm to explicitly separate anomalous log entries from the inexactly labeled anomalous log sets, thereby deducing exact anomalous log labels from inexactly labeled log sets. Extensive evaluation on three public datasets shows that our approach achieves an F1 score of over 85% with inexact labels.
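The bag-level labeling idea can be illustrated with a minimal sketch of the standard multi-instance learning assumption: a bag is anomalous if and only if at least one of its log entries is anomalous. This is not MIDLog's actual model, which the abstract does not specify. The per-entry scores, the 0.5 threshold, and the function names `bag_score`, `bag_label`, and `deduce_instance_labels` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def bag_score(instance_scores: Sequence[float]) -> float:
    """Max pooling: a bag is as anomalous as its most anomalous entry.

    Assumes a non-empty bag of per-entry anomaly scores in [0, 1].
    """
    return max(instance_scores)


def bag_label(instance_scores: Sequence[float], threshold: float = 0.5) -> int:
    """Predict the bag-level (inexact) label from per-entry scores."""
    return int(bag_score(instance_scores) >= threshold)


def deduce_instance_labels(
    instance_scores: Sequence[float],
    inexact_bag_label: int,
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[int]:
    """Deduce per-entry labels from one bag-level label.

    A normal bag (label 0) contains only normal entries. An anomalous
    bag (label 1) contains at least one anomalous entry, so if no score
    reaches the threshold, the top-scoring entry is flagged.
    """
    if inexact_bag_label == 0:
        return [0] * len(instance_scores)
    labels = [int(score >= threshold) for score in instance_scores]
    if not any(labels):
        top = max(range(len(instance_scores)), key=lambda i: instance_scores[i])
        labels[top] = 1
    return labels
```

Under these assumptions, an anomalous bag with scores `[0.1, 0.9, 0.2]` flags only the second entry, while a bag labeled normal yields all-normal entries regardless of the scores.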