WWW2026
Conflict-Aware RAG: Multi-Stage Learning with Conflict Signals for Robust Retrieval-Augmented Generation
Haiyan Wu, Chenchen Wang, Chaoqun Sun, Chengxiong Lu, Zhiqiang Zhang, Yanhong Chen
Abstract
Retrieval-Augmented Generation (RAG) mitigates hallucinations and knowledge gaps in Large Language Models (LLMs) on knowledge-intensive tasks by incorporating external web-based knowledge. However, when integrating diverse and potentially conflicting web-sourced information, RAG systems are prone to knowledge conflicts that manifest as incorrect or inconsistent model behavior and ultimately lead to unreliable responses. To address this challenge, this paper proposes Conflict-Aware RAG, a general training framework that leverages the model's inherent conflict-sensing capability to build a more robust RAG system through phased optimization. At the core of the framework is ConScore, a conflict signal that quantifies the model's awareness of potential knowledge conflicts by comparing generative probabilities across distinct knowledge sources. This signal guides both the construction of training data and a multi-stage optimization workflow: in the Supervised Fine-Tuning (SFT) stage, conflict features are used to select representative distracting documents, laying the groundwork for core RAG capabilities; in the Direct Preference Optimization (DPO) stage, high-quality preference pairs are constructed using the conflict signal to strengthen the model's robustness against distracting knowledge; and in the Reranking stage, conflict confidence and information gain are combined to improve coordination between the retriever and the LLM. Experiments on six knowledge-intensive question answering (QA) datasets demonstrate that Conflict-Aware RAG significantly outperforms mainstream baselines. Further ablation studies and quantitative analyses validate the method's stability and generalization, providing a foundation for robust RAG systems.
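To make the idea of a probability-based conflict signal concrete, the following is a minimal illustrative sketch, not the paper's definition of ConScore, which the abstract does not specify. It assumes that per-token log-probabilities of a reference answer are already available under two conditions (the model alone, and the model conditioned on one retrieved document) and takes the gap between their length-normalized values as the signal. The function names `mean_logprob`, `conflict_signal`, and `rank_distractors` are hypothetical and chosen only for illustration.

```python
from statistics import fmean
from typing import Mapping, Sequence


def mean_logprob(token_logprobs: Sequence[float]) -> float:
    """Length-normalized log-probability of an answer (mean over tokens)."""
    if not token_logprobs:
        raise ValueError("token_logprobs must be non-empty")
    return fmean(token_logprobs)


def conflict_signal(closed_book: Sequence[float],
                    with_document: Sequence[float]) -> float:
    """Hypothetical stand-in for a conflict signal (an assumption, not ConScore).

    Positive values mean the document lowers the likelihood the model assigns
    to the answer relative to its own parametric knowledge.
    """
    return mean_logprob(closed_book) - mean_logprob(with_document)


def rank_distractors(closed_book: Sequence[float],
                     per_document: Mapping[str, Sequence[float]]) -> list[str]:
    """Order document ids from most to least conflicting with the model."""
    scores = {doc_id: conflict_signal(closed_book, logprobs)
              for doc_id, logprobs in per_document.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy log-probabilities, invented purely for illustration.
    closed_book = [-0.1, -0.3]
    per_document = {"doc_a": [-0.2, -0.2], "doc_b": [-1.0, -2.0]}
    print(rank_distractors(closed_book, per_document))
```

Under this assumed formulation, documents at the top of the ranking would be candidates for the "representative distracting documents" mentioned for the SFT stage, and the same score could in principle order candidates when building preference pairs. How the actual framework defines and uses the signal is given in the body of the paper.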