ICLR2021
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing
Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, Bailin Wang, Yi Chern Tan, Xinyi Yang, Dragomir R. Radev, Richard Socher, Caiming Xiong
59 citations
Abstract
We present GRAPPA, an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data. We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar (SCFG). We pre-train GRAPPA on the synthetic data to inject important structural properties commonly found in table semantic parsing into the pre-training language model. To maintain the model's ability to represent real-world data, we also include masked language modeling (MLM) on several existing table-and-language datasets to regularize our pre-training process. Our proposed pre-training strategy is much data-efficient. When incorporated with strong base semantic parsers, GRAPPA achieves new state-of-the-art results on four popular fully supervised and weakly supervised table semantic parsing tasks. The pre-trained embeddings can be downloaded at https://huggingface.co/Salesforce/grappa_large_jnt .