ACL 2025
Query-Driven Multimodal GraphRAG: Dynamic Local Knowledge Graph Construction for Online Reasoning
Chenyang Bu, Guojie Chang, Zihao Chen, CunYuan Dang, Zhize Wu, Yi He, Xindong Wu
Abstract
The increasing adoption of Large Language Models (LLMs) in complex reasoning tasks makes their interpretability and reliability essential. Recent advances toward this end include retrieval-augmented generation (RAG) and knowledge graph-enhanced RAG (GraphRAG), but both are constrained by static knowledge bases and ineffective integration of multimodal data. In response, we propose a Query-Driven Multimodal GraphRAG framework that dynamically constructs local knowledge graphs tailored to query semantics. Our approach 1) derives graph patterns from query semantics to guide knowledge extraction, 2) employs a multi-path retrieval strategy to pinpoint core knowledge, and 3) supplements missing multimodal information ad hoc. Experimental results on the MultimodalQA and WebQA datasets demonstrate that our framework achieves state-of-the-art performance among unsupervised competitors, particularly excelling in cross-modal understanding of complex queries. The code is publicly available at https://github.com/DMiC-Lab-HFUT/Query-Driven-Multimodal-GraphRAG.