EMNLP 2022

Summarizing Community-based Question-Answer Pairs

Ting-Yao Hsu, Yoshi Suhara, Xiaolan Wang

Cited by 5

Abstract

Community-based Question Answering (CQA), which allows users to acquire their desired information, has increasingly become an essential component of online services in various domains such as e-commerce, travel, and dining. However, the overwhelming number of CQA pairs makes it difficult for users without a particular intent to find useful information spread across them. To help users quickly digest the key information, we propose a novel CQA summarization task that aims to create a concise summary from CQA pairs. To this end, we first design a multi-stage data annotation process and create a benchmark dataset, CoQASum, based on the Amazon QA corpus. We then compare a collection of extractive and abstractive summarization methods and establish a strong baseline approach, DedupLED, for the CQA summarization task. Our experiments further confirm two key challenges in the CQA summarization task: sentence-type transfer and duplicate removal. Our data and code are publicly available at https://github.com/megagonlabs/qa-summarization.

* Work done while at Megagon Labs.

Example CQA pairs:

Q: Is this actually a rigid board or more of a floppy mat?
A: It is rigid.the main board is rigid,the two sides are semi.

Q: Is this actually a rigid board or more of a floppy mat?
A: The main area is very sturdy. Then there are two work area pads that are more flexible so when moving those I keep two hands on them.

Q: how wide is each Side piece?
A: 16 inches wide (there are two).
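
The duplication challenge named in the abstract can be illustrated with a minimal sketch. This is not DedupLED and not the authors' code: it only groups answers under near-duplicate questions using token-level Jaccard similarity, which is a simple heuristic swapped in for illustration. The function names (`tokens`, `jaccard`, `group_by_question`) and the `threshold` value of 0.8 are assumptions made for this example.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens in a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-level Jaccard similarity between two strings (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def group_by_question(pairs, threshold=0.8):
    """Group answers whose questions are near-duplicates.

    `threshold` is an illustrative value, not taken from the paper.
    Returns a list of (representative_question, [answers]) tuples.
    """
    groups = []
    for question, answer in pairs:
        for representative, answers in groups:
            if jaccard(question, representative) >= threshold:
                answers.append(answer)
                break
        else:
            # No similar question seen so far: start a new group.
            groups.append((question, [answer]))
    return groups


# The three example CQA pairs quoted in this document.
pairs = [
    ("Is this actually a rigid board or more of a floppy mat?",
     "It is rigid.the main board is rigid,the two sides are semi."),
    ("Is this actually a rigid board or more of a floppy mat?",
     "The main area is very sturdy. Then there are two work area pads "
     "that are more flexible so when moving those I keep two hands on them."),
    ("how wide is each Side piece?",
     "16 inches wide (there are two)."),
]

groups = group_by_question(pairs)
for question, answers in groups:
    print(question, "->", len(answers), "answer(s)")
```

Grouping only detects repeated questions. Merging the grouped answers into declarative summary sentences (the sentence-type transfer challenge) would still require a summarization model and is outside the scope of this sketch.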