ACL 2025

DTCRS: Dynamic Tree Construction for Recursive Summarization

Guanran Luo, Zhongquan Jian, Wentao Qiu, Meihong Wang, Qingqiang Wu

Cited 3 times

Abstract

Retrieval-Augmented Generation (RAG) mitigates hallucination in large language models (LLMs) by integrating external knowledge. Abstractive questions that involve multistep reasoning often require knowledge from multiple sections of a document. To meet this need, recent research has introduced recursive summarization, which constructs a hierarchical summary tree by clustering text chunks and thereby integrates information from different parts of the document to provide evidence for abstractive questions. However, summary trees often contain a large number of redundant summary nodes, which not only increase construction time but may also negatively impact question answering. Moreover, recursive summarization is not suitable for all types of questions. We introduce DTCRS, a method that dynamically generates summary trees based on document structure and query semantics. DTCRS determines whether a summary tree is necessary by analyzing the question type. It then decomposes the question and uses the embeddings of the subquestions as initial cluster centers, reducing redundant summaries while improving the relevance between summaries and the question. Our approach significantly reduces summary tree construction time and achieves substantial improvements across three QA tasks. Additionally, we investigate the applicability of recursive summarization to different question types, providing valuable insights for future research.
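
The idea of using subquestion embeddings as initial cluster centers can be illustrated with a minimal sketch. It assumes a plain k-means loop with squared Euclidean distance and hand-written two-dimensional toy vectors in place of real embeddings; the abstract does not state which clustering algorithm, distance, or embedding model DTCRS uses, so the function `seeded_kmeans` and all data below are hypothetical and are not the authors' implementation.

```python
from typing import List, Tuple

Vector = List[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def seeded_kmeans(
    chunks: List[Vector], seeds: List[Vector], max_iters: int = 20
) -> Tuple[List[int], List[Vector]]:
    """K-means whose initial centers are the given seeds (assumed here to be
    subquestion embeddings), so the number of clusters equals the number of
    subquestions instead of being chosen independently of the query."""
    centers = [list(s) for s in seeds]
    labels = [-1] * len(chunks)
    for _ in range(max_iters):
        # Assign every chunk to its nearest current center.
        new_labels = [
            min(range(len(centers)), key=lambda k: sq_dist(v, centers[k]))
            for v in chunks
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Move each center to the mean of its members; a cluster with no
        # members keeps its previous center.
        for k in range(len(centers)):
            members = [v for v, lab in zip(chunks, labels) if lab == k]
            if members:
                centers[k] = mean(members)
    return labels, centers


# Toy data (hypothetical): five chunk vectors and two subquestion vectors.
chunk_embeddings = [[0.9, 0.1], [1.0, 0.0], [0.1, 0.9], [0.0, 1.0], [0.8, 0.2]]
subquestion_embeddings = [[1.0, 0.0], [0.0, 1.0]]

labels, centers = seeded_kmeans(chunk_embeddings, subquestion_embeddings)
print(labels)
```

In this sketch each resulting cluster groups the chunks closest to one subquestion, and the chunks in a cluster would then be summarized into one node of the tree. Seeding the centers this way ties both the number and the position of clusters to the query, which is one plausible reading of how the approach reduces redundant summary nodes.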