KDD 2022

Model Degradation Hinders Deep Graph Neural Networks

Wentao Zhang, Zeang Sheng, Ziqi Yin, Yuezihan Jiang, Yikuan Xia, Jun Gao, Zhi Yang, Bin Cui

Cited by 42

Abstract

Graph Neural Networks (GNNs) have achieved great success in various graph mining tasks. However, drastic performance degradation is always observed when a GNN is stacked with many layers. As a result, most GNNs only have shallow architectures, which limits their expressive power and exploitation of deep neighborhoods. Most recent studies attribute the performance degradation of deep GNNs to the over-smoothing issue. In this paper, we disentangle the conventional graph convolution operation into two independent operations: Propagation (P) and Transformation (T). Following this, the depth of a GNN can be split into the propagation depth ($D_p$) and the transformation depth ($D_t$). Through extensive experiments, we find that the major cause for the performance degradation of deep GNNs is the model degradation issue caused by large $D_t$ rather than the over-smoothing issue mainly caused by large $D_p$. Further, we present Adaptive Initial Residual (AIR), a plug-and-play module compatible with all kinds of GNN architectures, to alleviate the model degradation issue and the over-smoothing issue simultaneously. Experimental results on six real-world datasets demonstrate that GNNs equipped with AIR outperform most GNNs with shallow architectures owing to the benefits of both large $D_p$ and $D_t$, while the time costs associated with AIR can be ignored.

CCS CONCEPTS

• Computing methodologies → Machine learning; • Mathematics of computing → Graph algorithms.
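To make the split between propagation and transformation concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes GCN-style symmetric normalization with self-loops for P and a plain ReLU MLP for T. The abstract does not give the AIR formula, so `propagate_with_gated_residual` is only an illustrative guess at an adaptive initial residual: a per-node sigmoid gate, driven by a hypothetical `gate_vector`, mixes the input features back in at every propagation step. All function and parameter names here are assumptions.

```python
import math


def matmul(a, b):
    """Dense matrix product for lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} (assumed GCN-style normalization)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def propagate(a_norm, x, depth_p):
    """Propagation (P): apply the normalized adjacency depth_p times, no weights."""
    h = x
    for _ in range(depth_p):
        h = matmul(a_norm, h)
    return h


def transform(h, weights):
    """Transformation (T): len(weights) dense layers, ReLU between layers."""
    for k, w in enumerate(weights):
        h = matmul(h, w)
        if k < len(weights) - 1:
            h = [[max(0.0, v) for v in row] for row in h]
    return h


def propagate_with_gated_residual(a_norm, x, depth_p, gate_vector):
    """Propagation with a per-node gated initial residual (illustrative assumption).

    Each step computes alpha_i = sigmoid(gate_vector . [x_i | h_i]) and sets
    h_i <- alpha_i * x_i + (1 - alpha_i) * (A_norm h)_i.
    gate_vector is hypothetical; in a real model it would be learned.
    """
    h = x
    for _ in range(depth_p):
        smoothed = matmul(a_norm, h)
        new_h = []
        for x_i, h_i, s_i in zip(x, h, smoothed):
            score = sum(g * v for g, v in zip(gate_vector, x_i + h_i))
            alpha = 1.0 / (1.0 + math.exp(-score))
            new_h.append([alpha * xv + (1.0 - alpha) * sv for xv, sv in zip(x_i, s_i)])
        h = new_h
    return h


# Toy 3-node path graph 0 - 1 - 2 with 2 features per node.
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
x = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5]]
a_norm = normalize_adjacency(adj)

# Large propagation depth, with and without the residual gate (all-zero gate => alpha = 0.5).
deep_plain = propagate(a_norm, x, depth_p=200)
deep_gated = propagate_with_gated_residual(a_norm, x, depth_p=200, gate_vector=[0.0] * 4)

# Transformation depth is chosen independently: here two layers with identity weights.
identity = [[1.0, 0.0], [0.0, 1.0]]
logits = transform(deep_gated, [identity, identity])
```

On this toy graph, plain propagation at depth 200 makes the two end nodes indistinguishable, which is the over-smoothing effect tied to large propagation depth, while the gated variant keeps them apart. The sketch does not reproduce the model degradation effect tied to large transformation depth, since that only shows up when the weights are trained.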