EMNLP 2022

Hypoformer: Hybrid Decomposition Transformer for Edge-friendly Neural Machine Translation

Sunzhu Li, Peng Zhang, Guobing Gan, Xiuqing Lv, Benyou Wang, Victor Junqiu Wei, Xin Jiang

Abstract

The Transformer has proven effective in Neural Machine Translation (NMT). However, it is memory-intensive and slow on edge devices, which makes real-time feedback difficult. To compress and accelerate the Transformer, we propose a Hybrid Tensor-Train (HTT) decomposition, which retains full rank while reducing both operations and parameters. A Transformer using HTT, named Hypoformer, consistently and notably outperforms recent lightweight state-of-the-art methods on three standard translation tasks across different parameter and speed scales. In extremely low-resource scenarios, Hypoformer achieves a 7.1-point absolute improvement in BLEU and a 1.27× speedup over the vanilla Transformer on the IWSLT'14 De-En task.
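To give a sense of the parameter savings that tensor-train factorizations can offer, the sketch below counts parameters for a weight matrix stored in the standard TT-matrix format and compares the result with a dense matrix of the same shape. It does not implement the paper's hybrid (HTT) variant, whose construction is not described in the abstract. The function names, the 512×512 layer shape, the mode factorization, and the ranks are all illustrative assumptions.

```python
from math import prod


def tt_matrix_params(row_modes, col_modes, ranks):
    """Parameter count of a matrix in the standard TT-matrix format.

    Core k has shape (ranks[k], row_modes[k], col_modes[k], ranks[k + 1]),
    and the boundary ranks are fixed to 1.
    """
    if len(row_modes) != len(col_modes):
        raise ValueError("row_modes and col_modes must have the same length")
    if len(ranks) != len(row_modes) + 1 or ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("ranks must have length d + 1 and start and end with 1")
    return sum(
        ranks[k] * row_modes[k] * col_modes[k] * ranks[k + 1]
        for k in range(len(row_modes))
    )


def dense_params(row_modes, col_modes):
    """Parameter count of the equivalent dense matrix."""
    return prod(row_modes) * prod(col_modes)


# Illustrative (assumed) setting: a 512 x 512 layer with 512 = 8 * 8 * 8.
row_modes = (8, 8, 8)
col_modes = (8, 8, 8)
ranks = (1, 16, 16, 1)

tt = tt_matrix_params(row_modes, col_modes, ranks)
dense = dense_params(row_modes, col_modes)
print(f"TT-matrix: {tt} parameters, dense: {dense} parameters")
print(f"compression ratio: {dense / tt:.2f}x")
```

With these assumed settings the TT-matrix cores hold 18,432 parameters against 262,144 for the dense matrix. Larger ranks trade some of that compression for expressiveness.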