WWW 2026
Towards Robust Detection of Chinese Toxic Variants via Dynamic Knowledge Graph-LLM Reasoning
Shaochen Yang, Kefei Zhou, Wei Xu
Abstract
With the growing importance of content safety, toxic language detection, especially in Chinese online environments, has become a key task in natural language processing. However, real-world toxic expressions often appear in obfuscated forms such as pinyin abbreviations, symbol insertion, or visually similar character substitutions, which makes them difficult to detect with traditional rule-based or static models. To address this challenge, we propose Variant-KG, a dynamic knowledge graph construction method for toxic text variants. The graph encodes structural relations between canonical toxic terms and their variants based on phonetic similarity, visual resemblance, and contextual co-occurrence. A small amount of labeled data is further used to fine-tune large language models (LLMs), enabling initial normalization and variant recognition. On top of this, we design a collaborative detection framework that combines Variant-KG with frozen LLMs. It performs graph-augmented prompting for structure-aware reasoning and adopts a Think-Search-Generate paradigm that dynamically recovers broken paths when graph connections are incomplete, enabling both data self-enhancement and knowledge completion during inference. Evaluations on multiple Chinese toxic language datasets show that our model consistently outperforms both non-knowledge-enhanced and existing knowledge-enhanced baselines, demonstrating the effectiveness of the proposed dynamic reasoning framework in handling diverse toxic expressions.
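The interplay between a variant graph, graph-augmented prompting, and path recovery can be illustrated with a minimal sketch. It is not the authors' implementation: the class `VariantKG`, the functions `resolve` and `build_prompt`, the relation label `"recovered"`, and the mild example term 笨蛋 ("fool") with its variants are all hypothetical. In particular, the fallback step below is a simple symbol-stripping heuristic that stands in for the LLM-driven Think-Search-Generate step described in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class VariantKG:
    """Toy variant graph: maps a variant string to (canonical term, relation)."""
    edges: dict = field(default_factory=dict)

    def add_edge(self, variant, canonical, relation):
        self.edges[variant] = (canonical, relation)

    def lookup(self, token):
        # Returns (canonical, relation), or None when no edge exists.
        return self.edges.get(token)

    def canonical_terms(self):
        return {canonical for canonical, _ in self.edges.values()}


def strip_symbols(token):
    # Keep letters, digits, and CJK characters; drop inserted symbols such as '*'.
    return "".join(ch for ch in token if ch.isalnum())


def resolve(kg, token):
    """Map a token to its canonical term, completing the graph when possible."""
    hit = kg.lookup(token)
    if hit is not None:
        return hit[0]
    # No direct edge: try to recover the path. ASSUMPTION: a symbol-stripping
    # heuristic replaces the LLM-based search step here.
    cleaned = strip_symbols(token)
    if cleaned in kg.canonical_terms():
        canonical = cleaned
    else:
        hit = kg.lookup(cleaned)
        canonical = hit[0] if hit is not None else None
    if canonical is not None:
        # Knowledge completion: store the recovered edge for later queries.
        kg.add_edge(token, canonical, "recovered")
    return canonical


def build_prompt(kg, text):
    """Prepend graph facts about variants found in the text (hypothetical format)."""
    facts = [
        f"'{variant}' is a {relation} variant of '{canonical}'."
        for variant, (canonical, relation) in kg.edges.items()
        if variant in text
    ]
    return "Known variants:\n" + "\n".join(facts) + f"\n\nText: {text}\nIs this text toxic?"


kg = VariantKG()
kg.add_edge("bd", "笨蛋", "phonetic")   # pinyin abbreviation of "ben dan"
kg.add_edge("苯蛋", "笨蛋", "visual")    # visually similar character substitution
```

Calling `resolve(kg, "笨*蛋")` finds no stored edge, strips the inserted symbol, matches the canonical term, and writes the new edge back into the graph, so the next lookup of the same variant succeeds directly. In the framework described above, a frozen LLM would consume the output of something like `build_prompt` rather than a rule deciding the label.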