ACL2021

Keep the Primary, Rewrite the Secondary: A Two-Stage Approach for Paraphrase Generation

Yixuan Su, David Vandyke, Simon Baker, Yan Wang, Nigel Collier

Abstract

Paraphrase generation is an important and challenging NLG problem. In this work, we propose a new Identification-then-Aggregation (IA) framework to tackle this task. In the identification step, the input tokens are sorted into two groups by a novel Primary/Secondary Identification (PSI) algorithm. In the aggregation step, these groups are separately encoded, before being aggregated by a custom designed decoder, which autoregressively generates the paraphrased sentence. In extensive experiments on two benchmark datasets, we demonstrate that our model outperforms previous studies by a notable margin. We also show that the proposed approach can generate paraphrases in an interpretable and controllable way.