EMNLP 2023
On Bilingual Lexicon Induction with Large Language Models
Yaoyiran Li, Anna Korhonen, Ivan Vulić
Cited 2 times
Abstract
Bilingual Lexicon Induction (BLI) is a core task in multilingual NLP that still, to a large extent, relies on calculating cross-lingual word representations. Inspired by the global paradigm shift in NLP towards Large Language Models (LLMs), we examine the potential of the latest generation of LLMs for the development of bilingual lexicons. We ask the following research question: Is it possible to prompt and fine-tune multilingual LLMs (mLLMs) for BLI, and how does this approach compare against and complement current BLI approaches? To this end, we systematically study 1) zero-shot prompting for unsupervised BLI and 2) few-shot in-context prompting with a set of seed translation pairs, both without any LLM fine-tuning, as well as 3) standard BLI-oriented fine-tuning of smaller LLMs. We experiment with 18 open-source text-to-text mLLMs of different sizes (from 0.3B to 13B parameters) on two standard BLI benchmarks covering a range of typologically diverse languages. Our work is the first to demonstrate strong BLI capabilities of text-to-text mLLMs. The results reveal that few-shot prompting with in-context examples from nearest neighbours achieves the best performance, establishing new state-of-the-art BLI scores for many language pairs. We also conduct a series of in-depth analyses and ablation studies, providing more insights on BLI with (m)LLMs, along with their limitations.

Mask-Filling-Style Templates (Zero-Shot Prompting)
(w^x: source word; L^x: source language; L^y: target language)

1. The word 'w^x' in L^y is: <mask>.
2. The word w^x in L^y is: <mask>.
3. The word 'w^x' in L^y is: <mask>
4. The word w^x in L^y is <mask>
5. The L^x word w^x in L^y is: <mask>.
6. The L^x word w^x in L^y is <mask>.
7. The L^x word 'w^x' in L^y is: <mask>.
8. The L^x word 'w^x' in L^y is <mask>.
9. The L^x word w^x in L^y is: <mask>
10. The L^x word w^x in L^y is <mask>
11. The L^x word 'w^x' in L^y is: <mask>
12. The L^x word 'w^x' in L^y is <mask>
13. 'w^x' in L^y is: <mask>.
14. w^x in L^y is: <mask>.
15. 'w^x' in L^y is: <mask>
16. w^x in L^y is: <mask>
17. What is the translation of the word 'w^x' into L^y? <mask>.
18. What is the translation of the word w^x into L^y? <mask>.
19. What is the translation of the L^x word 'w^x' into L^y? <mask>.
20. What is the translation of the L^x word w^x into L^y? <mask>.
21. The translation of the word 'w^x' into L^y is <mask>.
22. The translation of the word w^x into L^y is <mask>.
23. The translation of the L^x word 'w^x' into L^y is <mask>.
24. How do you say 'w^x' in L^y? <mask>.
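To make the template notation concrete, the following is a minimal sketch of how a mask-filling template could be instantiated for a single source word. It is not the authors' code: the function name `fill_template`, the `{w_x}`/`{L_x}`/`{L_y}`/`{mask}` placeholder syntax, and the selection of three templates are illustrative assumptions. The default mask string is the literal `<mask>` from the list above; the token a given model actually expects (for example the `<extra_id_0>` sentinel used by T5-family tokenizers) would have to be supplied by the caller.

```python
# Illustrative sketch only: names and placeholder syntax are assumptions,
# not taken from the paper's implementation.

# A few of the listed templates, keyed by their number in the list above.
TEMPLATES = {
    1: "The word '{w_x}' in {L_y} is: {mask}.",
    5: "The {L_x} word {w_x} in {L_y} is: {mask}.",
    17: "What is the translation of the word '{w_x}' into {L_y}? {mask}.",
}


def fill_template(template_id, w_x, L_x, L_y, mask_token="<mask>"):
    """Instantiate one zero-shot template for source word w_x.

    w_x: source-language word; L_x / L_y: source / target language names.
    mask_token: string standing in for the slot the model should fill
    (assumed to be model-specific).
    """
    template = TEMPLATES[template_id]
    # Templates that do not mention L_x simply ignore that keyword.
    return template.format(w_x=w_x, L_x=L_x, L_y=L_y, mask=mask_token)


if __name__ == "__main__":
    print(fill_template(1, "Hund", "German", "English"))
    print(fill_template(5, "Hund", "German", "English", mask_token="<extra_id_0>"))
```

Keeping the mask as a placeholder rather than a hard-coded string lets the same template list be reused across models with different mask or sentinel tokens.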