ACL 2023
G³R: A Graph-Guided Generate-and-Rerank Framework for Complex and Cross-domain Text-to-SQL Generation
Yanzheng Xiang, Qian-Wen Zhang, Xu Zhang, Zejie Liu, Yunbo Cao, Deyu Zhou
Abstract
We present G³R, a framework for complex and cross-domain Text-to-SQL generation. G³R aims to address two limitations of current approaches: (1) the structure of the abstract syntax tree (AST), which is crucial for complex SQL generation, is not fully explored during decoding; (2) domain knowledge is not incorporated to enhance their ability to generalise to unseen domains. G³R consists of a graph-guided SQL generator and a knowledge-enhanced re-ranking mechanism. First, during decoding, an AST-Grammar bipartite graph is constructed to jointly model the AST and the corresponding grammar rules of the generated partial SQL query. The graph-guided SQL generator captures the structural information of this graph and fuses heterogeneous information to predict the action sequence, which uniquely determines the AST of the corresponding SQL query. Then, at inference time, a knowledge-enhanced re-ranking mechanism introduces domain knowledge to re-rank the candidate SQL queries in the beam output and choose the final answer. The SQL re-ranker is based on a pre-trained language model (PLM), and contrastive learning with hybrid prompt tuning is incorporated to stimulate the knowledge of the PLM and make it more discriminative. The proposed approach achieves state-of-the-art results on the Spider and Spider-DK benchmarks, which are challenging complex and cross-domain benchmarks for Text-to-SQL semantic analysis.
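The generate-and-rerank step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `rerank` and `toy_scorer` are hypothetical, and the keyword heuristic merely stands in for the PLM-based re-ranker, whose architecture and training are not specified here. The sketch only shows the control flow: the generator's beam supplies candidate SQL queries, a separate scorer rates each candidate against the question, and the top-rated candidate becomes the final answer.

```python
from typing import Callable, List, Tuple

# A beam entry: (candidate SQL string, generator log-probability).
Candidate = Tuple[str, float]


def rerank(question: str,
           beam: List[Candidate],
           scorer: Callable[[str, str], float]) -> str:
    """Return the candidate SQL that the scorer rates highest.

    The generator log-probability is ignored in this sketch; ties are
    broken by beam order, because max() returns the first maximal item.
    """
    if not beam:
        raise ValueError("beam is empty")
    best_sql, _ = max(beam, key=lambda cand: scorer(question, cand[0]))
    return best_sql


def toy_scorer(question: str, sql: str) -> float:
    """Hypothetical stand-in for a learned re-ranker (assumption).

    Rewards a COUNT aggregate when the question asks "how many".
    """
    q, s = question.lower(), sql.lower()
    return 1.0 if ("how many" in q and "count(" in s) else 0.0


beam = [
    ("SELECT name FROM singer", -0.2),      # top of the beam
    ("SELECT count(*) FROM singer", -0.5),  # lower generator score
]
final_sql = rerank("How many singers are there?", beam, toy_scorer)
```

Here the second beam entry is selected even though the generator ranked it lower, which is the kind of correction a re-ranker is meant to make.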