ACL 2023

Neural Machine Translation for Mathematical Formulae

Felix Petersen, Moritz Schubotz, André Greiner-Petter, Bela Gipp

Cited by 6

Abstract

We tackle the problem of neural machine translation of mathematical formulae between ambiguous presentation languages and unambiguous content languages. Compared to neural machine translation on natural language, mathematical formulae have a much smaller vocabulary and much longer sequences of symbols, while their translation requires extreme precision to satisfy mathematical information needs. In this work, we perform the tasks of translating from LaTeX to Mathematica as well as from LaTeX to semantic LaTeX. While recurrent, recursive, and transformer networks struggle with preserving all contained information, we find that convolutional sequence-to-sequence networks achieve 95.1% and 90.7% exact matches, respectively.
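To make the reported metric concrete, the following is a minimal sketch of how an exact-match rate could be computed over translated formulae. It assumes that a prediction counts as a match when it equals the reference string after whitespace normalization; the abstract does not specify the tokenization or normalization actually used, and the function name `exact_match_rate` and the sample LaTeX-to-Mathematica outputs are hypothetical illustrations, not data from the paper.

```python
def exact_match_rate(predictions, references):
    """Return the fraction of predictions identical to their reference.

    Assumption: strings are compared after collapsing runs of whitespace;
    the evaluation protocol of the paper may differ.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0

    def normalize(s):
        # Collapse any run of whitespace into a single space and trim the ends.
        return " ".join(s.split())

    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(predictions)


# Hypothetical Mathematica targets and model outputs (illustrative only).
references = ["Sin[x]^2", "Integrate[x^2, {x, 0, 1}]", "Sqrt[a + b]"]
predictions = ["Sin[x]^2", "Integrate[x^2,  {x, 0, 1}]", "Sqrt[a] + b"]

rate = exact_match_rate(predictions, references)
print(f"exact match rate: {rate:.3f}")
```

In this sketch the third prediction is a well-formed expression with a different meaning, so it counts as a miss. This is why exact match is a strict criterion for formula translation: a single misplaced bracket changes the mathematics.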