ACL 2022

Automatic Song Translation for Tonal Languages

Fenfei Guo, Chen Zhang, Zhirui Zhang, Qixin He, Kejun Zhang, Jun Xie, Jordan L. Boyd-Graber

Abstract

This paper addresses automatic song translation (AST) for tonal languages and the unique challenge of aligning words' tones with the melody of a song in addition to conveying the original meaning. We propose three criteria for effective AST (preserving semantics, singability, and intelligibility) and develop objectives for these criteria. We build a new benchmark for English-Mandarin song translation and develop an unsupervised AST system, Guided AliGnment for Automatic Song Translation (GagaST), which combines pre-training with three decoding constraints. Both automatic and human evaluations show that GagaST successfully balances semantics and singability.

[...] stresses. Nonetheless, there are cultural and commercial incentives for more efficient song translation; Frozen alone made over half a billion dollars in non-English box office receipts, and the musical Les Misérables has been performed on stage in over a dozen languages.

As we discuss in Section 2, while translating Western songs resembles poetry translation, translating into tonal languages (e.g., Mandarin, Zulu, and Vietnamese) brings new problems. In tonal languages, a word's pitch contributes to its meaning (Figure 2); when singing in tonal languages, the tones of the translated words must align with the "flow" of the pitches in the music (Section 2.1). For example, if "fáng shǒu" were sung instead of "fàng shǒu" (because the notes are going up), a listener might hear "defensive" instead of the intended meaning.
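
The tone-melody clash in the "fàng shǒu" example can be illustrated with a minimal sketch. This is not the GagaST alignment objective: the contour table, the use of MIDI note numbers, and the function names `melody_direction` and `tone_conflicts` are all assumptions made for illustration, and the check only looks at notes sung within a single syllable.

```python
# Hypothetical illustration, not the paper's method: a simplified
# contour label for each Mandarin tone (an assumption of this sketch).
TONE_CONTOUR = {1: "level", 2: "rising", 3: "dipping", 4: "falling"}


def melody_direction(pitches):
    """Classify a syllable's notes (MIDI numbers) as rising, falling, or level."""
    if pitches[-1] > pitches[0]:
        return "rising"
    if pitches[-1] < pitches[0]:
        return "falling"
    return "level"


def tone_conflicts(syllables):
    """Return syllables whose tone contour opposes the melody direction.

    `syllables` is a list of (pinyin, tone_number, midi_pitches) tuples.
    Only rising-versus-falling clashes are flagged; level and dipping
    tones are ignored in this simplified check.
    """
    opposite = {"rising": "falling", "falling": "rising"}
    conflicts = []
    for pinyin, tone, pitches in syllables:
        contour = TONE_CONTOUR[tone]
        direction = melody_direction(pitches)
        if opposite.get(contour) == direction:
            conflicts.append((pinyin, contour, direction))
    return conflicts


# "fàng" (tone 4, falling) set to rising notes; "shǒu" (tone 3) on one note.
lyric = [("fàng", 4, [60, 64]), ("shǒu", 3, [62])]
print(tone_conflicts(lyric))  # [('fàng', 'falling', 'rising')]
```

Under these assumptions, the falling-tone syllable sung on rising notes is flagged, which is the situation in which a listener could mishear it as the rising-tone "fáng".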