KDD 2022

Semantic Aware Answer Sentence Selection Using Self-Learning Based Domain Adaptation

Rajdeep Sarkar, Sourav Dutta, Haytham Assem, Mihael Arcan, John P. McCrae

Cited by 4

Abstract

Selecting an appropriate and relevant context is essential to the efficacy of several information retrieval applications, such as Question Answering (QA) systems. Answer Sentence Selection (AS2) is the task of selecting, from a larger text, the sentences that are relevant to and contain the answer to a user's query. Although AS2 systems trained on open-domain data (e.g., SQuAD, NQ) have been successful, they do not generalize well to closed-domain settings, where domain adaptation is challenging because domain-specific data is scarce and expensive to annotate. This paper proposes SEDAN, an effective self-learning framework to adapt AS2 models for domain-specific applications. We leverage large pre-trained language models to automatically generate domain-specific QA pairs for domain adaptation. We further fine-tune a pre-trained Sentence-BERT architecture to capture semantic relatedness between questions and answer sentences for AS2. Extensive experiments demonstrate the effectiveness of our proposed approach (over existing state-of-the-art AS2 baselines) on different Question Answering benchmark datasets.
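
To make the AS2 task concrete, the following is a minimal sketch of similarity-based sentence ranking: the question and each candidate sentence are encoded as vectors, and the candidates are sorted by cosine similarity to the question. It is not the SEDAN implementation. The names `toy_encode`, `cosine`, and `rank_sentences` are hypothetical, and the bag-of-words encoder is only a stand-in for a fine-tuned Sentence-BERT encoder, which would produce dense embeddings instead.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_encode(text: str) -> Vector:
    """Hypothetical stand-in encoder: lowercase bag-of-words counts.

    A real system would call a sentence-embedding model here.
    """
    return dict(Counter(re.findall(r"[a-z0-9]+", text.lower())))


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_sentences(
    question: str,
    sentences: List[str],
    encode: Callable[[str], Vector] = toy_encode,
) -> List[Tuple[str, float]]:
    """Return (sentence, score) pairs sorted by similarity to the question."""
    q_vec = encode(question)
    scored = [(s, cosine(q_vec, encode(s))) for s in sentences]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    question = "What is the boiling point of water?"
    candidates = [
        "The committee met on Tuesday.",
        "Water has a boiling point of 100 degrees Celsius at sea level.",
        "Ice floats because it is less dense than liquid water.",
    ]
    for sentence, score in rank_sentences(question, candidates):
        print(f"{score:.3f}  {sentence}")
```

Because the encoder is passed in as a parameter, swapping the toy encoder for a learned one changes only the `encode` argument; the ranking logic stays the same. Lexical overlap, as used here, misses paraphrases, which is the gap a semantic encoder is meant to close.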