ACL 2020

Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension

Fei Yuan, Linjun Shou, Xuanyu Bai, Ming Gong, Yaobo Liang, Nan Duan, Yan Fu, Daxin Jiang

21 citations

Abstract

Multilingual pre-trained models can leverage training data from a resource-rich source language (such as English) to improve performance on low-resource languages. However, transfer on the multilingual Machine Reading Comprehension (MRC) task is substantially less effective than on sentence classification tasks, mainly because MRC requires detecting word-level answer boundaries. In this paper, we propose two auxiliary tasks that introduce additional phrase boundary supervision in the fine-tuning stage: (1) a mixed MRC task, which translates the question or passage into other languages and builds cross-lingual question-passage pairs; and (2) a language-agnostic knowledge masking task that leverages knowledge phrases mined from the Web. Extensive experiments on two cross-lingual MRC datasets show the effectiveness of the proposed approach.
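
As a rough illustration of the mixed MRC task in (1), the sketch below pairs a question in one language with a passage in another. It is not the authors' implementation: the function name `build_mixed_pairs`, the dictionary layout, and the assumption that translations are already available are all hypothetical, and the sketch does not address how the answer span is located in a translated passage.

```python
from itertools import product


def build_mixed_pairs(questions, passages):
    """Build cross-lingual question-passage pairs.

    Both arguments are hypothetical inputs: dicts mapping a language code
    to the text in that language (translations assumed precomputed).
    Only pairs whose question and passage languages differ are kept.
    """
    pairs = []
    for (q_lang, q_text), (p_lang, p_text) in product(
        questions.items(), passages.items()
    ):
        if q_lang != p_lang:
            pairs.append(
                {
                    "q_lang": q_lang,
                    "p_lang": p_lang,
                    "question": q_text,
                    "passage": p_text,
                }
            )
    return pairs


# Illustrative data only; not taken from the paper's datasets.
example_questions = {"en": "Where is the Eiffel Tower?", "de": "Wo steht der Eiffelturm?"}
example_passages = {"en": "The Eiffel Tower is in Paris.", "de": "Der Eiffelturm steht in Paris."}
example_pairs = build_mixed_pairs(example_questions, example_passages)
```

With n languages available for both the question and the passage, this yields n(n-1) mixed pairs per original example; a practical setup would likely sample from them rather than use all.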