ACL 2021

Entity-Aware Abstractive Multi-Document Summarization

Hao Zhou, Weidong Ren, Gongshen Liu, Bo Su, Wei Lu

Abstract

Entities and their mentions convey significant semantic information in documents. In multi-document summarization, the same entity may appear across different documents. Capturing such cross-document entity information can be beneficial: intuitively, it allows the system to aggregate diverse useful information around the same entity for better summarization. In this paper, we present EMSum, an entity-aware model for abstractive multi-document summarization. Our model augments the classical Transformer-based encoder-decoder framework with a heterogeneous graph consisting of text units and entities as nodes, which allows rich cross-document information to be captured. In the decoding process, we design a novel two-level attention mechanism, allowing the model to deal with saliency and redundancy issues explicitly. Our model can also be used together with pre-trained language models, yielding improved performance. We conduct comprehensive experiments on standard datasets, and the results show the effectiveness of our approach.