ACL 2024
LANS: A Layout-Aware Neural Solver for Plane Geometry Problem
Zhongzhi Li, Ming-Liang Zhang, Fei Yin, Cheng-Lin Liu
Abstract
Geometry problem solving (GPS) is a challenging mathematical reasoning task requiring multi-modal understanding, fusion, and reasoning. Existing neural solvers treat GPS as a vision-language task but fall short in representing geometry diagrams, which carry rich and complex layout information. In this paper, we propose a layout-aware neural solver named LANS, which integrates two new modules: a multimodal layout-aware pre-trained language module (MLA-PLM) and layout-aware fusion attention (LA-FA). MLA-PLM adopts structural-semantic pre-training (SSP) to implement global relationship modeling, and point-match pre-training (PMP) to achieve alignment between visual points and textual points. LA-FA employs a layout-aware attention mask to realize point-guided cross-modal fusion, further boosting the layout awareness of LANS. Extensive experiments on the Geometry3K and PGPS9K datasets validate the effectiveness of the layout-aware modules and the superior problem-solving performance of our LANS solver over existing symbolic and neural solvers. We have made our code and data publicly available.
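The abstract does not specify how the layout-aware attention mask in LA-FA is constructed. As a generic illustration of masked cross-modal attention only, the following Python sketch assumes a hypothetical rule: a text token may attend to an image patch only if both are tagged with a common geometric point label. The function names (build_mask, masked_softmax, cross_attention) and the point-label inputs are illustrative assumptions, not the LANS implementation.

```python
import math


def build_mask(text_points, patch_points):
    """Return mask[i][j], True when text token i may attend to image patch j.

    Assumed rule (not taken from the paper): attention is allowed only when
    the token and the patch share at least one geometric point label.
    """
    return [[bool(set(t) & set(p)) for p in patch_points] for t in text_points]


def masked_softmax(scores, mask_row):
    """Softmax over allowed positions; blocked positions get weight 0."""
    allowed = [s for s, m in zip(scores, mask_row) if m]
    if not allowed:
        # Nothing to attend to: return all-zero weights.
        return [0.0] * len(scores)
    top = max(allowed)  # subtract the max for numerical stability
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values, mask):
    """Scaled dot-product attention from text queries to image keys/values."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q, mask_row in zip(queries, mask):
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = masked_softmax(scores, mask_row)
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return outputs


# Hypothetical toy input: three text tokens and three image patches,
# each tagged with the point labels it mentions or covers.
text_points = [["A"], ["B"], []]
patch_points = [["A"], ["B", "C"], []]
mask = build_mask(text_points, patch_points)
```

In this toy setup, a token that mentions point A attends only to the patch covering A, and a token with no point label receives all-zero attention weights over the image patches. A real model would more likely add a large negative bias to blocked scores inside a batched tensor operation; the explicit loops here are only for readability.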