KDD2024

Pre-trained KPI Anomaly Detection Model Through Disentangled Transformer

Zhaoyang Yu, Changhua Pei, Xin Wang, Minghua Ma, Chetan Bansal, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang, Xidao Wen, Jianhui Li, Gaogang Xie, Dan Pei

8 citations

DOI Publisher

Abstract

In large-scale online service systems, numerous Key Performance Indicators (KPIs), such as service response time and error rate, are gathered in a time-series format. KPI Anomaly Detection (KAD) is a critical data mining problem due to its widespread applications in real-world scenarios. However, KAD faces the challenges of dealing with KPI heterogeneity and noisy data. We propose KAD-Disformer, a KPI Anomaly Detection approach through Disentangled Transformer. KAD-Disformer pre-trains a model on existing accessible KPIs, and the pre-trained model can be effectively "fine-tuned" to unseen KPI using only a handful of samples from the unseen KPI. We propose a series of innovative designs, including disentangled projection for transformer, unsupervised few-shot fine-tuning (uTune), and denoising modules, each of which significantly contributes to the overall performance. Our extensive experiments demonstrate that KAD-Disformer surpasses the state-of-the-art universal anomaly detection model by 13% in F1-score and achieves comparable performance using only 1/8 of the finetuning samples saving about 25 hours. KAD-Disformer has been successfully deployed in the real-world cloud system serving millions of users, attesting to its feasibility and robustness. Our code is available at https://github.com/NetManAIOps/KAD-Disformer.