WWW2026

LHG: LLM-enhanced and Heterogeneous Graph-induced for Unsupervised Social Event Detection

Zitai Qiu, Rongwei Xu, Congbo Ma, Shan Xue, Jian Yang, Guanfeng Liu, Quan Z. Sheng, Amin Beheshti, Jia Wu

Abstract

Social event detection (SED) aims to detect events (news) on social media platforms and is essential in applications such as public opinion surveillance, disaster control, and market monitoring. However, social messages are short, dynamic, and multi-source, which makes annotating them time-consuming and difficult (label scarcity) and hinders the widespread application of SED. Existing unsupervised SED models rely on graph structures to compensate for the limited textual content, which results in unstable performance on dynamic social messages. To address these challenges, this work proposes LHG, an unsupervised SED framework with LLM enhancement and Heterogeneous Graph induction. Specifically, to address label scarcity, LHG uses an LLM to generate pseudo-labels for the initial social messages. Because LLM-generated labels are unreliable, LHG introduces a Meta-Path Guided Label Similarity Selector (MPLSS). MPLSS calculates the similarity between these pseudo-labels and, guided by the meta-paths in the heterogeneous information graph (HIG), organizes the corresponding initial social messages into triplets, thereby mitigating LLM-induced problems such as hallucination. To improve stability, LHG not only exploits the structural information in the HIG via MPLSS but also uses a hyperbolic representation to reduce the embedding distortion of the hierarchical structure in the HIG and in sentences, so that sufficient information remains available for dynamic social messages. Extensive experiments show that LHG achieves state-of-the-art (SOTA) results on two widely used real-world datasets.
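As a rough illustration of the triplet-construction idea behind MPLSS, the following minimal sketch pairs messages whose pseudo-labels are similar and that are linked through a shared graph node, and treats messages that are neither similar nor linked as negatives. The abstract does not specify the similarity measure, the threshold, or the exact selection rule, so everything here is an assumption: token-level Jaccard similarity is a stand-in, and the names `jaccard`, `build_triplets`, `attributes`, and `pos_threshold` are hypothetical.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two pseudo-label strings.

    Assumption: a stand-in for whatever label similarity MPLSS actually uses.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def build_triplets(pseudo_labels, attributes, pos_threshold=0.5):
    """Return (anchor, positive, negative) index triplets.

    pseudo_labels: one LLM-generated label string per message.
    attributes: one set of graph nodes (e.g. users, hashtags) per message;
        two messages sharing a node are treated as connected by a
        message-node-message meta-path (an illustrative simplification).
    """
    n = len(pseudo_labels)
    positives = {i: [] for i in range(n)}
    negatives = {i: [] for i in range(n)}
    for i, j in combinations(range(n), 2):
        similar = jaccard(pseudo_labels[i], pseudo_labels[j]) >= pos_threshold
        linked = bool(attributes[i] & attributes[j])
        if similar and linked:
            # Label similarity and graph structure agree: positive pair.
            positives[i].append(j)
            positives[j].append(i)
        elif not similar and not linked:
            # Both signals disagree with a match: negative pair.
            negatives[i].append(j)
            negatives[j].append(i)
        # Pairs where the two signals conflict are skipped as unreliable.
    return [
        (a, p, q)
        for a in range(n)
        for p in positives[a]
        for q in negatives[a]
    ]
```

Requiring both signals to agree is one simple way to keep a hallucinated pseudo-label from producing a training pair on its own; the actual selector may weigh the two signals differently.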