WWW2026

Missingness-aware Federated Contrastive Learning on Semantic Graphs

Shuo Yu, Zhuoyang Han, Guoqing Han, Tao Tang, Feng Ding, Qiang Zhang

Abstract

Semantic graphs are fundamental to the Web, enabling applications such as semantic search, recommendation, and knowledge-intensive reasoning. In decentralized Web environments, however, these graphs are distributed across organizations and constrained by strict privacy policies, making centralized training infeasible. Federated learning provides a promising solution, yet its effectiveness is severely limited by the dual incompleteness of real-world semantic graphs: missing node attributes and incomplete relational structures. Such dual missingness, often heterogeneous and unobserved across clients, causes substantial degradation in model performance. We present FedCL, a missingness-aware federated contrastive learning framework for dual-incomplete semantic graphs. FedCL introduces two key components: a topology estimation module, grounded in rate–distortion theory, that privately quantifies structural incompleteness across clients, and a federated reconstruction module that leverages these estimates to generate plausible relations without inferring sensitive attributes. To further improve robustness, FedCL applies graph contrastive learning across reconstructed subgraphs, encouraging semantic consistency across heterogeneous and incomplete client graphs. Experiments on benchmark citation and Web datasets demonstrate that FedCL consistently outperforms state-of-the-art baselines in accuracy and robustness under heterogeneous missingness, while preserving strong privacy guarantees. These results highlight FedCL as a scalable and trustworthy approach for federated learning on incomplete semantic graphs, advancing privacy-preserving knowledge sharing on the Web.