NeurIPS2024

GraphTrail: Translating GNN Predictions into Human-Interpretable Logical Rules

Burouj Armgaan, Manthan Dalmia, Sourav Medya, Sayan Ranu

Abstract

Instance-level explanation of graph neural networks (GNNs) is a well-studied area. These explainers, however, only explain an instance (e.g., a graph) and fail to uncover the combinatorial reasoning learned by a GNN from the training data towards making its predictions. In this work, we introduce GRAPHTRAIL, the first end-to-end, post-hoc, global GNN explainer that translates the functioning of a black-box GNN model to a boolean formula over the (sub)graph-level concepts without relying on local explainers. GRAPHTRAIL is unique in automatically mining the discriminative subgraph-level concepts using Shapley values. Subsequently, the GNN predictions are mapped to a human-interpretable boolean formula over these concepts through symbolic regression. Extensive experiments across diverse datasets and GNN architectures demonstrate significant improvement over existing global explainers in mapping GNN predictions to faithful logical formulae. The robust and accurate performance of GRAPHTRAIL makes it invaluable for improving GNNs and facilitates adoption in domains with strict transparency requirements. global explainer that (1) mines the subgraph concepts used by a black-box GNN model, and then (2) uncovers the boolean logic used by the GNN over these concepts to make its predictions. Works on global GNN explainers are limited [56, 16, 54, 2] . XGNN [56] and GNNInterpreter [50] are generative-modeling based global explainers. Both generate a graph that maximally aligns with a specified class label. While this generated graph likely contains important features used by the GNN in making its predictions, it may not actually be present in the dataset, limiting its utility for analyzing specific predictions. Additionally, it does not produce a human-interpretable rule explaining the class attributions made by the GNN. Finally, XGNN [56] requires domain-specific validity-rules as input, which affects its generalizability and results in inferior performance [2] . Another work [54] evaluates which base concepts are detected by the neurons of a GNN model during predictions and their relative importance. These concepts can be subgraphs or node-level properties, such as degrees. However, this method lacks the ability to automatically mine the concepts. More importantly, this method also does not generate a human-interpretable rule for the decision-making process of the GNN. The closest work to ours is GLGEXPLAINER [2] , which shares the objective of providing a global explanation of the GNN through a boolean formula over subgraph-level concepts. However, there are significant limitations that need to be addressed: • Dependency on instance-level explainers: GLGEXPLAINER does not mine the subgraph concepts in an end-to-end manner. Instead, it relies on an instance-level explainer (e.g. PGEXPLAINER [29]) to provide these concepts, over which it searches for the combinatorial formula mapping to the GNN predictions. This dependency creates a disconnect with the objective, as instance-level explainers lack a global understanding of the model. Our proposed approach develops an end-to-end pipeline that mines concepts based on global trends. • Lack of interpretability due to vector-level concepts: In GLGEXPLAINER, each concept in the formula corresponds to a feature vector and not a subgraph. These vectors represent the embedding of a cluster of subgraphs generated by the instance explainer. Hence, in its original form, the formula is not human-interpretable. To convert into a human-interpretable formula, GLGEXPLAINER randomly selects a subgraph from the cluster, assuming all subgraphs in a cluster are similar. Our investigation ( § 4) reveals that this assumption is rarely true in practice, compromising both interpretability and efficacy. • Lack of robustness: GLGEXPLAINER shows significant variation in the formula based on the training split used. As our analysis in § 4 reveals, due to the reliance on instance-level explanations, even when data are drawn from the same distribution, the base concept candidates vary, and consequently so does the eventual formula. At this juncture, we note that our work is distinct from the line of research on explainable GNNs [61, 10, 34] . Explainable GNNs are designed to make explainable predictions rather than explaining the predictions of a black-box GNN.