WWW2026

How Human Experts Educate Specialized LLMs: Filling Knowledge Gaps in KG-Augmented Generation through Hallucination Detection

Chaofan Li, Lixing Chen, Junhua Tang, Yang Bai, Yutong Zhang, Zhi Zheng, Pan Zhou, Zhe Qu

Abstract

The integration of Domain Knowledge Graphs (DKGs) into Retrieval-Augmented Generation has emerged as a promising approach for constructing Specialized Large Language Models (spLLMs). Because high-quality DKGs are scarce, existing approaches employ an evolutionary framework in which the DKG is continuously evolved while it is being used to enhance the LLM. Yet these methods face two key limitations: 1) heavy reliance on costly expert knowledge, and 2) neglect of the connection between the LLM's inherent knowledge and external expert knowledge. To address these issues, this paper introduces Epistemic Cognition-enhanced Specialized LLMs (EC-spLLM), a novel evolutionary framework that instills epistemic cognition into the LLM to systematically exploit both its internal knowledge and expert knowledge. At the core of EC-spLLM lies the Hallucination Detection-based Epistemic Cognition (HDEC) mechanism, which assesses the reliability of LLM-generated responses using the LLM's self-cognition and hallucination detection. This assessment ability enables EC-spLLM to selectively adopt either the expert-provided golden answer or a reliable LLM-generated answer during DKG evolution, thereby reducing dependence on experts and bridging internal and external knowledge sources to enhance performance. We conducted extensive experiments on five datasets spanning five domains, including emotional sociology and biology. Results show that EC-spLLM reduces the usage of golden answers by 67% on average while retaining 97.2% of the accuracy achieved by the SOTA method, and outperforms all other baselines.
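
The selective adoption step described in the abstract can be illustrated with a minimal Python sketch. It is not the HDEC mechanism itself, whose details the abstract does not give. As a stand-in for hallucination detection, the sketch uses a simple self-consistency check: several answers are sampled from the LLM, and the fraction agreeing with the majority answer is treated as a reliability score. The names `reliability_score`, `select_answer`, `ask_expert`, and the `threshold` value are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Callable, List, Tuple


def reliability_score(samples: List[str]) -> Tuple[str, float]:
    """Return the majority answer and the fraction of samples agreeing with it.

    Assumption: agreement among sampled answers is used as a proxy for
    reliability. This is a stand-in, not the paper's HDEC mechanism.
    """
    if not samples:
        raise ValueError("need at least one sampled answer")
    # Normalize lightly so trivial formatting differences do not count as disagreement.
    normalized = [s.strip().lower() for s in samples]
    answer, count = Counter(normalized).most_common(1)[0]
    return answer, count / len(normalized)


def select_answer(
    samples: List[str],
    ask_expert: Callable[[], str],
    threshold: float = 0.8,  # hypothetical cut-off, not taken from the paper
) -> Tuple[str, str]:
    """Pick the answer to write into the DKG and report where it came from.

    The expert is queried only when the LLM's answer is judged unreliable,
    which is how the number of golden answers used can be reduced.
    """
    answer, score = reliability_score(samples)
    if score >= threshold:
        return answer, "llm"
    return ask_expert(), "expert"
```

In this sketch the threshold controls the trade-off: a higher value sends more questions to the expert, while a lower value accepts more LLM-generated answers at a greater risk of storing a hallucinated one in the DKG.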