VLDB 2025

Graph Transformers for Query Plan Representation: Potentials and Challenges

Chenghao Lyu, Guillaume Lachaud, Gabriel Lozano, Yanlei Diao

Abstract

Query Plan Representation (QPR) is central to workload modeling, and various deep-learning-based architectures have been proposed for it in the literature. Our work is motivated by two key observations: (i) the research community still lacks clarity on which model, if any, best suits the QPR problem; and (ii) while transformers have revolutionized many fields, their potential for QPR remains largely underexplored. This study examines the strengths and challenges of Graph Transformers for QPR. We introduce a new taxonomy that unifies deep-learning-based QPR techniques along key design axes. Our benchmark analysis of common QPR architectures reveals that Graph Transformer Networks (GTNs) consistently outperform alternatives, but can degrade under limited training data. To address this, we propose novel data augmentation techniques to enhance training diversity, and we refine GTN architectures by replacing ineffective language-model-inspired components with techniques better suited to query plans. Evaluation on the JOB, TPC-H, and TPC-DS benchmarks shows that, with sufficient training data, the enhanced GTNs outperform existing models at capturing complex queries (JOB Full and TPC-DS) and enable a query embedder trained on TPC-DS to generalize to TPC-H queries out of the box.