EMNLP 2022

Bilingual Synchronization: Restoring Translational Relationships with Editing Operations

Jitao Xu, Josep Maria Crego, François Yvon

5 citations

Abstract

Machine Translation (MT) is usually viewed as a one-shot process that generates the target-language equivalent of some source text from scratch. We consider here a more general setting that assumes an initial target sequence, which must be transformed into a valid translation of the source, thereby restoring parallelism between source and target. For this bilingual synchronization task, we consider several architectures (both autoregressive and non-autoregressive) and training regimes, and experiment with multiple practical settings such as simulated interactive MT, translating with Translation Memory (TM), and TM cleaning. Our results suggest that a single generic edit-based system, once fine-tuned, can compare with, or even outperform, dedicated systems specifically trained for these tasks.

[Figure: example edit operations (Substitution, Deletion) applied to initial French translations of "That 's not going to happen .", e.g. "Cela n' arrivera pas , mais seulement ." and "Cela n' arrivera pas ."]
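To make the notion of editing operations concrete, the following minimal sketch lists the token-level edits that turn an initial target sequence into a synchronized translation. It is only an illustration built on Python's standard `difflib`, not the edit-based models studied in the paper; the function name `edit_operations` and the whitespace tokenization are assumptions made for this example.

```python
from difflib import SequenceMatcher


def edit_operations(initial, final):
    """List the token-level edits that turn `initial` into `final`.

    Hypothetical helper for illustration only: it diffs two
    whitespace-tokenized strings and does not reproduce the paper's models.
    """
    src, tgt = initial.split(), final.split()
    # autojunk=False disables difflib's heuristic that ignores very frequent items.
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only insert / delete / replace spans
            ops.append((tag, src[i1:i2], tgt[j1:j2]))
    return ops


# An initial target with spurious material, and the translation it should become.
initial = "Cela n' arrivera pas , mais seulement ."
final = "Cela n' arrivera pas ."
print(edit_operations(initial, final))
# [('delete', [',', 'mais', 'seulement'], [])]
```

Here a single deletion restores parallelism with the source "That 's not going to happen ."; an incomplete initial target such as "n' arrivera ." would instead yield insertions.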