ASE2023

Dynamic Graph Neural Networks-Based Alert Link Prediction for Online Service Systems

Yiru Chen, Chenxi Zhang, Zhen Dong, Dingyu Yang, Xin Peng, Jiayu Ou, Hong Yang, Zheshun Wu, Xiaojun Qu, Wei Li

被引用 3 次

摘要

D ynamic networks are used in a wide range of fields, including social network analysis, recommender systems, and epidemiology. Representing complex networks as structures changing over time allows network models to leverage not only structural but also temporal patterns. However, as dynamic network literature stems from diverse fields and makes use of inconsistent terminology, it is challenging to navigate. Meanwhile, graph neural networks (GNN) has gained a lot of attention in recent years for their ability to perform well on a range of network science tasks, such as link prediction and node classification. Despite the popularity of graph neural networks and the proven benefits of dynamic network models, there has been little focus on graph neural networks for dynamic networks, namely dynamic graph neural networks (DGNN). However, these models, particularly DGNNs, are rarely compared to each other or existing heuristics. Different works evaluate their models in different ways, thus one cannot compare evaluation metrics and their results directly. With the ubiquity of complex networks, improvements in link predictions have a potential for far-reaching impact. In machine learning, a well-known approach to improve classification performance is to combine several methods in an ensemble. It is established that diversity or complementarity of the combined classifiers, is a prerequisite for a potential performance improvement. In this work, we combine link prediction heuristics (e.g. Common Neighbour, Adamic-Adar, etc.), with static GNNs, discrete DGNNs, and continuous DGNNs. We analyze the importance of the methods and how complementary they are. We also use greedy searches to select better subsets of methods to combine. In our experiments, the greedy search ensemble shows a 55% improvement over the best single method on average. A roll-forward version of the greedy search ensemble further improved the performance with an average increase of 9.6%. This dissertation aims to provide a foundation necessary for improved performance on the dynamic link prediction task. This is done over four core chapters, each associated, or soon to be associated with a publication. The first is a three-part chapter that establishes a foundation for dynamic networks, surveys DGNNs, and outlines dynamic link prediction. The second is an empirical comparison of dynamic link prediction methods, including link prediction heuristics, static GNNs, dynamic DGNNs, and continuous DGNNs. The third introduces an ensemble that combines the methods from Chapter 3 in a heterogeneous ensemble. The fourth extends the heterogeneous ensemble by making it dynamically adaptive, such that it can react to changes in the dynamic network as the data changes. Dissertation supervised by Prof. Katarzyna Musial-Gabrys and Prof. Bogdan Gabrys. First and foremost I'd like to thank my supervisor Professor Katarzyna Musial-Gabrys and my co-supervisor Professor Bogdan Gabrys. I'm grateful for the opportunity to do a PhD, and for the continued support and guidance throughout my study. I have been introduced to the fantastic world of network science and they supported me pursuing the most interesting research topic I could imagine. Their supervision has added great complementary knowledge to this work and their feedback has been invaluable to guide me on this journey. I'd like to thank those that have given advice and helped along my PhD journey. Dr. Yi Zhang has served as the panel expert throughout my PhD, his encouraging feedback and attitude has been very helpful in difficult times. Dr. Jamie Garcia Marin and Dr. Nico Pietroni who served as panel chairs. My co-author and student Matthew Hellmich whose work has added valuable insight to the comparative analysis. Dr. Hongxu Chen, Dr. Radosław Michalski and Dr. Stanisław Saganowski for insightful and impactful conversations. Dr. Chandranath Adak for providing this thesis template. Professor Petter Holme and Dr. Thomas Kipf for their foundational work that inspired much of this thesis. I'd like to thank my friends, colleagues and family who have supported my work in various ways. Leif Olav Skarding and Eirik Skarding for supporting with additional computational power. Rune Holmgren, James Curran, Andreas Løve Selvik and Nisha Sreenivasan for proofreading and feedback. My friends and colleagues Mingshan Jia, Yu-Xuan Qiu, Bin Wang, Xiaolin Zhang and Mohammad Barbar for their knowledge and motivation. Thank you to Tommaso Armstrong for rewarding me with cookies if I stayed productive. Finally, for keeping me sane, I'd like to thank