WWW2026

A Simplex Approach to Synthetic Knowledge Graph Generation

Ana Alexandra Morim da Silva, Atul Bhopalsing Pundir, Michael Röder, Axel-Cyrille Ngonga Ngomo

Abstract

The growing scale of knowledge graphs demands scalable systems for processing them, and benchmarking such systems accurately requires large knowledge graphs. While data-driven synthetic generators based on versioned datasets are a promising way to generate large, realistic graphs, current approaches generate the graph at the triple level without considering higher-order structures. This work introduces SimplexKG, a simplex-based synthetic knowledge graph generator. Our approach analyzes d-dimensional simplices within input knowledge graphs and leverages the identified simplicial networks to generate a synthetic graph of arbitrary size. We explore whether leveraging higher-dimensional structures enhances the realism of synthetic graphs by evaluating both the structure and the utility of the generated graphs. Our approach consistently outperforms 2 baseline generators and 6 variants of the state-of-the-art generator LEMMING in structural fidelity and in triple store benchmarking scenarios across 3 datasets. Specifically, compared to the second-best approach, our graphs achieve a structuredness value up to 26.62 % closer to that of the target graph, while reducing the query throughput error by up to 6.59 % across storage solutions.
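To make the notion of a d-dimensional simplex concrete, the following is a minimal sketch and not the SimplexKG implementation. It assumes that a d-simplex can be read as a set of d + 1 resources that are pairwise connected once predicates and edge directions are ignored; the function name `find_simplices` and the toy triples are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def find_simplices(triples, d):
    """Return all d-simplices, i.e. sets of d + 1 pairwise-connected nodes.

    Assumption (not taken from the paper): predicates and edge direction
    are ignored, and self-loops do not contribute to any simplex.
    """
    # Build an undirected adjacency map from (subject, predicate, object) triples.
    adjacency = {}
    for subject, _predicate, obj in triples:
        if subject == obj:
            continue
        adjacency.setdefault(subject, set()).add(obj)
        adjacency.setdefault(obj, set()).add(subject)

    # Brute-force check of every (d + 1)-node subset; fine for toy inputs only.
    simplices = set()
    for nodes in combinations(sorted(adjacency), d + 1):
        if all(v in adjacency[u] for u, v in combinations(nodes, 2)):
            simplices.add(frozenset(nodes))
    return simplices


# Hypothetical toy graph: one triangle (a, b, c) plus a dangling edge (c, d).
triples = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "knows", "a"),
    ("c", "worksAt", "d"),
]

triangles = find_simplices(triples, 2)
edges = find_simplices(triples, 1)
```

On this toy input, `triangles` holds the single 2-simplex {a, b, c} and `edges` holds the four 1-simplices. The exhaustive subset enumeration grows combinatorially with graph size, so a generator working on real knowledge graphs would need a dedicated clique or simplex enumeration strategy.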