ACL 2024

Dictionary-Aided Translation for Handling Multi-Word Expressions in Low-Resource Languages

Antonios Dimakis, Stella Markantonatou, Antonios Anastasopoulos

Abstract

Multi-word expressions (MWEs) present unique challenges in natural language processing (NLP), particularly within the context of translation systems, due to their inherent scarcity, non-compositional nature, and other distinct lexical and morphosyntactic characteristics, issues that are exacerbated in low-resource settings. In this study, we elucidate and attempt to address these challenges by leveraging a substantial corpus of human-annotated Greek MWEs. To address the complexity of translating such phrases, we propose a novel method leveraging an available out-of-context lexicon. We assess the translation capabilities of current state-of-the-art systems on this task, employing both automated metrics and human evaluators. We find that by using our method when applicable, the performance of current systems can be significantly improved. However, these models are still unable to produce translations comparable to those of a human speaker.