ACL2021

PRAL: A Tailored Pre-Training Model for Task-Oriented Dialog Generation

Jing Gu, Qingyang Wu, Chongruo Wu, Weiyan Shi, Zhou Yu

Abstract

Large pre-trained language generation models such as GPT-2 have demonstrated their effectiveness as language priors by reaching state-of-the-art results in various language generation tasks. However, the performance of pre-trained models on task-oriented dialog tasks is still under-explored. We propose a Pre-trained Role Alternating Language model (PRAL), explicitly designed for task-oriented conversational systems. We design several techniques (start position randomization, knowledge distillation, and history discount) to improve pre-training performance. In addition, we introduce a high-quality, large-scale task-oriented dialog pre-training dataset. We effectively adapt PRAL to three downstream tasks. With much less training data, PRAL outperforms or is on par with state-of-the-art models.