ACL 2025
Enhanced Data Synthesis for LLM through Reasoning Structures Generated by Hierarchical GFlowNet
Tianpeng Bu, Minying Zhang, Hongtao Duan, Shurui Li, Lulu Hu, Yu Li
Abstract
Large language models (LLMs) excel at problem-solving but require training data with diverse reasoning processes. Existing data synthesis methods mainly optimize instruction-response pairs and lack a systematic design for the underlying reasoning structure. This paper proposes RSS, a Reasoning Structure driven data Synthesis method. We first develop a hierarchical GFlowNet that constructs reasoning structures efficiently through a coarse-to-fine directed acyclic graph (DAG) growth process. These reasoning DAGs then guide instruction generation via an iterative suggester-editor workflow and improve response quality through a structure-aware strategy. Experiments show that LLMs trained on our synthetic datasets achieve 48.50%, 84.00%, and 79.90% on AlpacaEval2, GSM8K, and HumanEval, respectively, outperforming existing data synthesis methods.
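To make the phrase "coarse-to-fine DAG growth" concrete, the following is a minimal illustrative sketch and not the method proposed in this paper. It assumes a uniform random policy in place of a learned GFlowNet policy, and all names (`grow_coarse_dag`, `refine`, `is_acyclic`, the `c0`, `c1.0` node labels) are hypothetical. A coarse DAG is grown node by node, then each coarse node is expanded into a short chain of finer reasoning steps.

```python
import random
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def grow_coarse_dag(n_nodes: int, edge_prob: float, rng: random.Random) -> List[Edge]:
    """Grow a coarse DAG one node at a time.

    Each new node only receives edges from earlier nodes, so the graph is
    acyclic by construction. A random policy stands in for a learned one.
    """
    edges: List[Edge] = []
    for j in range(1, n_nodes):
        parents = [i for i in range(j) if rng.random() < edge_prob]
        if not parents:
            parents = [rng.randrange(j)]  # give every later node at least one parent
        edges.extend((f"c{i}", f"c{j}") for i in parents)
    return edges


def refine(coarse_edges: List[Edge], n_nodes: int, max_steps: int,
           rng: random.Random) -> List[Edge]:
    """Expand each coarse node into a chain of 1..max_steps fine-grained steps."""
    chains: Dict[str, List[str]] = {}
    fine_edges: List[Edge] = []
    for k in range(n_nodes):
        name = f"c{k}"
        chain = [f"{name}.{s}" for s in range(rng.randint(1, max_steps))]
        chains[name] = chain
        fine_edges.extend(zip(chain, chain[1:]))  # edges inside the chain
    for u, v in coarse_edges:
        # A coarse edge connects the last step of u to the first step of v.
        fine_edges.append((chains[u][-1], chains[v][0]))
    return fine_edges


def is_acyclic(edges: List[Edge]) -> bool:
    """Check acyclicity with Kahn's algorithm (repeatedly remove zero in-degree nodes)."""
    nodes = {n for e in edges for n in e}
    indeg = {n: 0 for n in nodes}
    for _, v in edges:
        indeg[v] += 1
    frontier = [n for n in nodes if indeg[n] == 0]
    removed = 0
    while frontier:
        u = frontier.pop()
        removed += 1
        for a, b in edges:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    frontier.append(b)
    return removed == len(nodes)


if __name__ == "__main__":
    rng = random.Random(0)
    coarse = grow_coarse_dag(n_nodes=4, edge_prob=0.5, rng=rng)
    fine = refine(coarse, n_nodes=4, max_steps=3, rng=rng)
    print("coarse edges:", coarse)
    print("fine edges:", fine)
    print("acyclic:", is_acyclic(fine))
```

In a GFlowNet setting, the random choices above would be replaced by a policy trained so that complete structures are sampled with probability proportional to a reward; the sketch only shows the two-level growth order and why the result stays acyclic.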