NeurIPS 2022

ADBench: Anomaly Detection Benchmark

Songqiao Han, Xiyang Hu, Hailiang Huang, Minqi Jiang, Yue Zhao

Abstract

Given the long list of anomaly detection algorithms developed over the last few decades, how do they perform with regard to (i) varying levels of supervision, (ii) different types of anomalies, and (iii) noisy and corrupted data? In this work, we answer these key questions by conducting (to the best of our knowledge) the most comprehensive anomaly detection benchmark, named ADBench, with 30 algorithms on 57 benchmark datasets. Our extensive experiments (98,436 in total) yield meaningful insights into the role of supervision and anomaly types, and open up future directions for researchers in algorithm selection and design. With ADBench, researchers can efficiently conduct comprehensive and fair evaluations of newly proposed methods on the datasets (including our contributed ones from the natural language and computer vision domains) against the existing baselines. To foster accessibility and reproducibility, we fully open-source ADBench and the corresponding results.

Key takeaways: Through extensive experiments, we find that (i) surprisingly, none of the benchmarked unsupervised algorithms is statistically better than the others, emphasizing the importance of algorithm

* All authors contribute equally and are listed alphabetically. Direct questions to Minqi Jiang and Yue Zhao.

36th Conference on Neural Information Processing Systems (NeurIPS 2022) Track on Datasets and Benchmarks.