WWW2026
OpenDigger: A Practical Framework for Assessing Community Health and Sustainability in Open Source Collaboration Platforms
Wei Wang, Fanyu Han, Shengyu Zhao, Xuan Zhou, Weining Qian, Aoying Zhou, Xiaoya Xia, Liyun Yang, Rong Wang, Ning Jiang, Moming Duan
Abstract
The rapid development and widespread adoption of open source software, accelerated by the web, have fostered a vibrant ecosystem for collaborative development and innovation. GitHub, a leading platform for collaborative software development, has more than 100 million registered users, providing a substantial ecosystem for examining open source community behaviors. Existing tools for measuring open source communities focus primarily on metrics such as issue response time, pull request response time, or incremental stars to provide insights into community activity. However, these tools are limited in their ability to assess community influence from the perspective of collaboration networks. Moreover, current data collection solutions offer fixed functionality and lack the flexibility to support the multi-source, fine-grained, and customizable data acquisition that comprehensive analysis of Open Source Ecosystems (OSEs) requires. In this paper, we present OpenDigger, a framework for multi-dimensional assessment of collaboration activities in OSEs. To enable scalable, modular, and continuous acquisition of OSE data, we developed OpenCrawler, a one-line service providing customizable, fine-grained control over data collection. Using the collected data, OpenDigger computes 20 statistical and 2 network-based metrics, and our empirical analysis verifies their effectiveness in enabling a comprehensive assessment of trends in OSEs. By continuously collecting logs from GitHub and Gitee, OpenDigger has accumulated over 9 billion records. Our framework has been deployed in multiple industrial environments, including Alibaba Group, Ant Group, Apache Foundation, and Mulan Open Source Community.