ACL 2022

Continual Pre-training of Language Models for Math Problem Understanding with Syntax-Aware Memory Network

Zheng Gong, Kun Zhou, Wayne Xin Zhao, Jing Sha, Shijin Wang, Ji-Rong Wen

Abstract

In this paper, we study how to continually pre-train language models to improve their understanding of math problems. Specifically, we focus on a fundamental challenge in modeling math problems: how to fuse the semantics of the textual description and the formulas, which are highly different in essence. To address this issue, we propose a new approach called COMUS to continually pre-train language models for math problem understanding with a syntax-aware memory network. In this approach, we first construct the math syntax graph to model structural semantic information by combining the parsing trees of the text and formulas, and then design syntax-aware memory networks to deeply fuse the features from the graph and the text. With the help of syntax relations, we can model the interaction between a token from the text and its semantically related nodes within the formulas, which helps capture fine-grained semantic correlations between texts and formulas. Besides, we devise three continual pre-training tasks to further align and fuse the representations of the text and the math syntax graph. Experimental results on four tasks in the math domain demonstrate the effectiveness of our approach. Our code and data are publicly available at https://github.com/RUCAIBox/COMUS.

† Equal contribution. This work was done when the two authors were interns at iFLYTEK Research.

Introduction

Understanding math problems via automated methods is a desired machine capacity for artificial-intelligence-assisted learning. Such a capacity is key to the success of a variety of educational applications, including math problem retrieval (Reusch et al., 2021), problem recommendation (Liu et al., 2018), and problem solving (Huang et al., 2020). To automatically understand math problems, it is feasible to learn computational representations