EMNLP 2021
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs
Zewen Chi, Li Dong, Shuming Ma, Shaohan Huang, Saksham Singhal, Xian-Ling Mao, Heyan Huang, Xia Song, Furu Wei
60 citations
Abstract
Multilingual T5 (mT5; Xue et al., 2020) pre-trains a sequence-to-sequence model on massive monolingual texts and has shown promising results on many cross-lingual tasks. In this paper, we improve the multilingual text-to-text transfer Transformer with translation pairs (mT6). Specifically, we explore three cross-lingual text-to-text pre-training tasks: machine translation, translation pair span corruption, and translation span corruption. In addition, we propose a partially non-autoregressive objective for text-to-text pre-training. We evaluate the methods on eight multilingual benchmark datasets, covering sentence classification, named entity recognition, question answering, and abstractive summarization. Experimental results show that the proposed mT6 improves cross-lingual transferability over mT5.
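As an illustration only, here is a minimal sketch of how span corruption might be applied to a translation pair. It rests on assumptions the abstract does not state: the two sentences are simply concatenated, masked spans are replaced by T5-style sentinel tokens (`<extra_id_0>`, `<extra_id_1>`, ...), and span positions are given by hand rather than sampled. The function names `span_corrupt` and `translation_pair_span_corrupt` are hypothetical and are not taken from the paper or any library.

```python
from typing import List, Tuple


def span_corrupt(
    tokens: List[str], spans: List[Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """Replace each [start, end) span with a sentinel token.

    Returns (corrupted_input, target) in the T5 sentinel convention
    (assumed here, not confirmed by the abstract). Spans must not overlap.
    """
    corrupted: List[str] = []
    target: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"
        corrupted.extend(tokens[cursor:start])  # keep unmasked tokens
        corrupted.append(sentinel)              # mask the span in the input
        target.append(sentinel)                 # the target lists the masked tokens
        target.extend(tokens[start:end])
        cursor = end
    corrupted.extend(tokens[cursor:])
    target.append(f"<extra_id_{len(spans)}>")   # closing sentinel
    return corrupted, target


def translation_pair_span_corrupt(
    src: List[str], tgt: List[str], spans: List[Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """Assumed reading: corrupt spans over the concatenated translation pair."""
    return span_corrupt(src + tgt, spans)


if __name__ == "__main__":
    src = ["Thank", "you", "very", "much"]
    tgt = ["Vielen", "Dank"]
    # Hand-picked spans over the concatenation (indices 0..5).
    inp, out = translation_pair_span_corrupt(src, tgt, [(1, 3), (5, 6)])
    print(" ".join(inp))
    print(" ".join(out))
```

Because the masked spans can fall in either language, the model can use the unmasked tokens of one sentence as context for recovering tokens of the other. In a real pre-training pipeline the spans would be sampled at random over subword tokens rather than fixed as above.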