KDD 2025

Saliency-Bench: A Comprehensive Benchmark for Evaluating Visual Explanations

Yifei Zhang, James Song, Siyi Gu, Tianxu Jiang, Bo Pan, Guangji Bai, Liang Zhao

Cited by 2

Abstract

Explainable AI (XAI) has gained significant attention for providing insights into the decision-making processes of deep learning models, particularly for image classification tasks through saliency-based visual explanations. Despite their success, key challenges persist due to the scarcity of annotated datasets and the absence of standardized evaluation protocols. In this paper, we introduce Saliency-Bench, a novel benchmark designed to evaluate visual explanations generated by saliency methods across multiple datasets. We curated, constructed, and annotated eight datasets covering diverse tasks such as scene classification, cancer diagnosis, object classification, and action classification, each with corresponding ground-truth explanation annotations. The benchmark includes a standardized and unified evaluation pipeline for assessing the faithfulness and alignment of visual explanations, providing a holistic assessment of explanation performance. We benchmark these eight datasets with widely used saliency methods on different image classifier architectures to evaluate explanation quality. Additionally, we developed a user-friendly toolkit that automates the evaluation pipeline, from data access and loading to result evaluation. The benchmark is available at https://github.com/XAIdataset/XAIdataset.github.io.
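To make the two evaluation axes concrete, the following is a minimal sketch of how alignment and faithfulness are commonly measured for saliency maps. It is not the Saliency-Bench toolkit: the function names `alignment_iou` and `deletion_drop`, the 0.5 threshold, the 20% deletion fraction, and the choice of IoU and a deletion-style score are all assumptions made for illustration, and the benchmark's own metrics may differ.

```python
import numpy as np


def alignment_iou(saliency, mask, threshold=0.5):
    """Alignment (assumed metric): IoU between a thresholded saliency map
    and a binary ground-truth explanation mask of the same shape."""
    saliency = np.asarray(saliency, dtype=float)
    mask = np.asarray(mask).astype(bool)
    # Min-max normalize to [0, 1] so one threshold works across methods.
    span = saliency.max() - saliency.min()
    norm = (saliency - saliency.min()) / span if span > 0 else np.zeros_like(saliency)
    pred = norm >= threshold
    union = np.logical_or(pred, mask).sum()
    if union == 0:
        return 1.0  # both regions are empty, so they agree trivially
    return float(np.logical_and(pred, mask).sum() / union)


def deletion_drop(predict, image, saliency, fraction=0.2, fill=0.0):
    """Faithfulness (assumed metric): drop in the model score after the
    top `fraction` most salient pixels are replaced with `fill`.

    `predict` is a hypothetical callable mapping an image array to a scalar
    score for the class of interest.
    """
    saliency = np.asarray(saliency, dtype=float)
    # C-contiguous copy so the reshape below is a view we can write through.
    masked = np.array(image, dtype=float, order="C")
    k = max(1, int(round(fraction * saliency.size)))
    # Flat indices of the k most salient pixels.
    top = np.argsort(saliency.ravel())[::-1][:k]
    # View as (H*W, channels); works for both HxW and HxWxC images.
    pixels = masked.reshape(saliency.size, -1)
    pixels[top] = fill
    return float(predict(np.asarray(image, dtype=float)) - predict(masked))
```

A larger `deletion_drop` suggests the highlighted pixels matter more to the model's prediction, while a larger `alignment_iou` suggests the explanation agrees more closely with the human annotation. The two can disagree, which is one reason to report both.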