ASE 2025

Fair Developer Score: Build-Adjusted Measurement of Effort and Impact

Xinzhou Wang, Jiancong Zhu, Jinghan Feng, Zixuan Zhang, Joshua Rauvola, Devon Delgado, Ahmad Antar, Abid Ali

Abstract

Assessing developer productivity in large software projects has become a pressing concern for both academia and industry, as organizations seek reliable ways to understand how engineering effort translates into business value. Traditional metrics, such as commit frequency, lines of code, and code churn, have been widely adopted but remain problematic: they conflate trivial edits with architecturally significant restructuring and provide little insight into task-level contributions. To address this limitation, we introduce a commit-centric analytic framework that uses clustering to regroup dispersed commit logs into coherent units, termed builds, that align more closely with the functional level of development tasks. Unlike prior approaches that combine heterogeneous signals such as issues, reviews, or communication logs, our method relies solely on the structural and temporal properties of commits, making it lightweight and broadly applicable. Each build is evaluated along two orthogonal axes: developer effort and build importance. Effort captures the scale and character of a developer’s contributions, considering code ownership, scope, architectural centrality, novelty, and cadence. Importance quantifies the build’s systemic consequence, integrating scale of change, distribution of changes, architectural centrality, complexity, task priority, and proximity to release milestones. Combining these axes yields the Fair Developer Score (FDS), a composite measure that reconciles individual effort with organizational value. Validation centers on exposure-controlled, matched comparisons that pair FDS-ranked developers with commit-count peers matched on churn, files changed, and builds participated in. On the Linux kernel, FDS-ranked developers exhibit significantly higher Average Importance and Average Effort than volume-matched peers, with lower rework trends. Cross-repository analyses across Kubernetes, TensorFlow, Apache Kafka, and PostgreSQL demonstrate consistent Effort advantages and context-dependent Importance effects, indicating that FDS surfaces impactful work beyond raw activity using commit-only data.
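
The pipeline summarized above (group commits into builds, score each build on effort and importance, then combine the two) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Commit` record, the greedy grouping rule (time gap plus shared files) standing in for the clustering step, the `max_gap` threshold, and the fusion rule, which is assumed to sum effort times importance over a developer's builds. Per-build effort and importance values are passed in as plain numbers because the abstract does not specify how they are computed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Commit:
    # Hypothetical minimal commit record: only structural (files) and
    # temporal (timestamp) properties, plus the author.
    author: str
    timestamp: datetime
    files: frozenset


def group_into_builds(commits, max_gap=timedelta(hours=6)):
    """Greedy stand-in for the clustering step (an assumption, not the
    paper's algorithm): a commit joins the most recent build whose last
    commit is within max_gap and which touches at least one shared file."""
    builds = []
    for commit in sorted(commits, key=lambda c: c.timestamp):
        target = None
        for build in reversed(builds):
            close_in_time = commit.timestamp - build[-1].timestamp <= max_gap
            shares_files = any(commit.files & other.files for other in build)
            if close_in_time and shares_files:
                target = build
                break
        if target is None:
            builds.append([commit])  # start a new build
        else:
            target.append(commit)
    return builds


def fair_developer_score(effort, importance):
    """Assumed fusion rule: sum over builds of effort * importance.

    effort:     {(developer, build_id): float}
    importance: {build_id: float}
    """
    scores = {}
    for (developer, build_id), value in effort.items():
        scores[developer] = scores.get(developer, 0.0) + value * importance[build_id]
    return scores


commits = [
    Commit("alice", datetime(2025, 1, 1, 10, 0), frozenset({"mm/slab.c"})),
    Commit("bob", datetime(2025, 1, 1, 11, 0), frozenset({"mm/slab.c", "mm/page.c"})),
    Commit("alice", datetime(2025, 1, 3, 9, 0), frozenset({"net/tcp.c"})),
]
builds = group_into_builds(commits)

# Placeholder scores for illustration only; real values would come from
# the effort and importance factors listed in the abstract.
effort = {("alice", 0): 0.5, ("bob", 0): 0.25, ("alice", 1): 1.0}
importance = {0: 2.0, 1: 0.5}
scores = fair_developer_score(effort, importance)
```

In this toy input, the first two commits fall into one build because they are an hour apart and both touch `mm/slab.c`, while the third starts a new build. Under the assumed fusion rule, a developer's score grows only when effort lands in builds that carry importance, which is the intuition behind separating the two axes.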