ACL 2021

Highlight-Transformer: Leveraging Key Phrase Aware Attention to Improve Abstractive Multi-Document Summarization

Shuaiqi Liu, Jiannong Cao, Ruosong Yang, Zhiyuan Wen

Abstract

Abstractive multi-document summarization aims to generate a comprehensive summary covering the salient content of multiple input documents. Compared with previous RNN-based models, Transformer-based models employ the self-attention mechanism to capture dependencies in the input documents and can generate better summaries. Existing works, however, do not consider key phrases when determining self-attention weights. Consequently, some tokens within key phrases receive only small attention weights, which can prevent the model from fully encoding the key phrases that convey the salient ideas of the input documents. In this paper, we introduce the Highlight-Transformer, a model with a highlighting mechanism in the encoder that assigns greater attention weights to the tokens within key phrases. We propose two structures of highlighting attention for each head, together with the multi-head highlighting attention. Experimental results on the Multi-News dataset show that our proposed model significantly outperforms competitive baseline models.
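To illustrate the general idea of assigning greater attention weights to tokens within key phrases, the following is a minimal sketch and not the structures proposed in the paper, which the abstract does not specify. It assumes a single attention head, an additive bonus on the pre-softmax scores of key-phrase tokens, and a given binary key-phrase mask. The names `highlighted_attention`, `key_phrase_mask`, and `boost`, as well as the toy vectors, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def highlighted_attention(queries, keys, key_phrase_mask, boost=1.0):
    """Scaled dot-product attention weights with a key-phrase bonus.

    Assumption (not taken from the paper): highlighting is modeled by adding
    `boost` to the pre-softmax score of every key token flagged in
    `key_phrase_mask`.
    """
    dim = len(keys[0])
    weights = []
    for q in queries:
        scores = []
        for k, in_key_phrase in zip(keys, key_phrase_mask):
            score = sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            if in_key_phrase:
                score += boost
            scores.append(score)
        weights.append(softmax(scores))
    return weights


# Toy input: four tokens with 2-dimensional vectors (illustrative values only).
tokens = ["the", "wildfire", "evacuation", "continued"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
mask = [False, True, True, False]  # "wildfire evacuation" treated as a key phrase

plain = highlighted_attention(vectors, vectors, [False] * len(tokens), boost=0.0)
highlighted = highlighted_attention(vectors, vectors, mask, boost=1.0)

for token, before, after in zip(tokens, plain, highlighted):
    mass_before = sum(w for w, m in zip(before, mask) if m)
    mass_after = sum(w for w, m in zip(after, mask) if m)
    print(f"{token}: key-phrase attention mass {mass_before:.3f} -> {mass_after:.3f}")
```

In this sketch each row of weights still sums to one, so raising the scores of key-phrase tokens necessarily lowers the share given to the remaining tokens. A multiplicative or learned variant would behave differently.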