WWW2025
GraphCSR: A Space and Time-Efficient Sparse Matrix Representation for Web-scale Graph Processing
Xinbiao Gan, Tiejun Li, Qiang Zhang, Guang Wu, Bo Yang, Chunye Gong, Jie Liu, Kai Lu
Cited by 3
Abstract
Graph data processing is essential for web-scale applications, including social networks, recommendation systems, and Web of Things (WoT) systems, where large, sparsely connected graphs dominate. Traditional sparse matrix storage formats such as compressed sparse row (CSR) face significant memory and performance bottlenecks in distributed, federated, and edge-based computing environments, which are increasingly central to the web. To address this challenge, we propose GraphCSR, a novel storage format that clusters vertices with identical degrees and stores only the starting index of each group. This approach minimizes memory overhead and facilitates batch memory access, improving overall performance and making the format particularly suitable for federated systems and resource-constrained edge nodes. Our experiments across various graph operations and large datasets show that GraphCSR achieves considerable memory savings and performance gains for large-scale, distributed graph processing. We also deployed GraphCSR on two production-grade supercomputers, demonstrating its potential for scaling web and WoT graph processing in large-scale distributed computing systems.
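The core idea in the abstract, grouping vertices of equal degree and keeping one starting index per group instead of one row offset per vertex, can be illustrated with a small sketch. This is not the authors' implementation. The function names (`build_degree_grouped_csr`, `neighbors_at`), the relabeling of vertices by degree-sorted position, and the use of plain Python lists are all assumptions made for illustration.

```python
from bisect import bisect_right
from itertools import groupby


def build_degree_grouped_csr(adj):
    """Build a degree-grouped layout from an adjacency list (hypothetical helper).

    adj[v] is the neighbor list of vertex v. Returns:
      order        - vertex ids sorted by degree (position -> original id)
      group_degree - degree shared by every vertex in each group
      group_pos    - first position in `order` belonging to each group
      group_off    - first offset in `cols` belonging to each group
      cols         - concatenated neighbor lists, in `order`
    """
    # Sorting by degree makes equal-degree vertices contiguous.
    order = sorted(range(len(adj)), key=lambda v: len(adj[v]))
    group_degree, group_pos, group_off, cols = [], [], [], []
    pos = 0
    for degree, members in groupby(order, key=lambda v: len(adj[v])):
        # One record per distinct degree, instead of one offset per vertex.
        group_degree.append(degree)
        group_pos.append(pos)
        group_off.append(len(cols))
        for v in members:
            cols.extend(adj[v])
            pos += 1
    return order, group_degree, group_pos, group_off, cols


def neighbors_at(position, group_degree, group_pos, group_off, cols):
    """Return the neighbors of the vertex stored at `position` in `order`."""
    # Find the last group whose first position is <= position.
    g = bisect_right(group_pos, position) - 1
    d = group_degree[g]
    # Within a group every row has length d, so the offset is computed
    # arithmetically rather than read from a per-vertex pointer array.
    start = group_off[g] + (position - group_pos[g]) * d
    return cols[start:start + d]


if __name__ == "__main__":
    adj = [[1, 2], [0], [0], [4, 5], [3], [3]]
    order, gdeg, gpos, goff, cols = build_degree_grouped_csr(adj)
    for p, v in enumerate(order):
        assert neighbors_at(p, gdeg, gpos, goff, cols) == adj[v]
```

In this sketch the index structure grows with the number of distinct degrees rather than with the number of vertices. That is where the savings over a conventional CSR row-pointer array would come from when many vertices share a degree. Rows of equal length stored contiguously are also what would make batched access straightforward.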