ICML 2024
Position: Relational Deep Learning - Graph Representation Learning on Relational Databases
Matthias Fey, Weihua Hu, Kexin Huang, Jan Eric Lenssen, Rishabh Ranjan, Joshua Robinson, Rex Ying, Jiaxuan You, Jure Leskovec
Abstract
Much of the world's most valued data is stored in data warehouses, where the data is spread across many tables connected by primary-foreign key relations. However, building machine learning models using this data is both challenging and time-consuming. The core problem is that no machine learning method is capable of learning directly on data spread across multiple relational tables. Current methods can only learn from a single table, so the data must first be joined and aggregated into a single training table, a process known as feature engineering. Here we introduce an end-to-end deep representation learning approach that learns directly on data spread across multiple tables. We name our approach Relational Deep Learning. The core idea is to view relational tables as a heterogeneous graph, with a node for each row in each table, and edges specified by primary-foreign key relations. Message Passing Neural Networks can then automatically learn across multiple tables to extract representations that leverage all input data, without any manual feature engineering. To facilitate research, we also develop RELBENCH, a set of benchmark datasets and an implementation of Relational Deep Learning. The data covers a wide spectrum, from discussions on Stack Exchange to book reviews on the Amazon Product Catalog. Overall, we define a new research area that generalizes graph machine learning and broadens its applicability to a wide set of AI use cases.
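The core idea in the abstract, a node per row and an edge per primary-foreign key reference, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy `users` and `reviews` tables, the helper names `build_hetero_graph` and `mean_aggregate`, and the use of plain Python dictionaries. It is not the RELBENCH API, and the fixed mean aggregation stands in for a learned Message Passing Neural Network layer.

```python
from collections import defaultdict

# Hypothetical toy database: two tables linked by a primary-foreign key relation.
tables = {
    "users": [
        {"user_id": 1, "name": "ada"},
        {"user_id": 2, "name": "bob"},
    ],
    "reviews": [
        {"review_id": 10, "user_id": 1, "rating": 5.0},
        {"review_id": 11, "user_id": 1, "rating": 3.0},
        {"review_id": 12, "user_id": 2, "rating": 4.0},
    ],
}
primary_keys = {"users": "user_id", "reviews": "review_id"}
# Each entry is (child table, foreign-key column, parent table).
foreign_keys = [("reviews", "user_id", "users")]


def build_hetero_graph(tables, primary_keys, foreign_keys):
    """Create one node per row (typed by table) and one edge per key reference."""
    nodes = {
        name: {row[primary_keys[name]]: row for row in rows}
        for name, rows in tables.items()
    }
    edges = defaultdict(list)
    for child, fk_col, parent in foreign_keys:
        for row in tables[child]:
            src = row[primary_keys[child]]
            dst = row[fk_col]
            # Skip dangling references to rows that do not exist.
            if dst in nodes[parent]:
                edges[(child, fk_col, parent)].append((src, dst))
    return nodes, dict(edges)


def mean_aggregate(nodes, edges, edge_type, feature):
    """One round of (non-learned) mean aggregation from child rows to parent rows."""
    child, _, _ = edge_type
    collected = defaultdict(list)
    for src, dst in edges[edge_type]:
        collected[dst].append(nodes[child][src][feature])
    return {pk: sum(vals) / len(vals) for pk, vals in collected.items()}


nodes, edges = build_hetero_graph(tables, primary_keys, foreign_keys)
avg_rating = mean_aggregate(nodes, edges, ("reviews", "user_id", "users"), "rating")
print(avg_rating)  # {1: 4.0, 2: 4.0}
```

The aggregate computed here is the kind of feature a manual join and group-by would produce. In the approach the abstract describes, the aggregation would instead be learned end to end over the whole graph.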