ASE 2025

Diplomatist: What Do Cross-language Dependencies Reflect Software Ecosystem Health?

Fanyi Meng, Ying Wang, Chun Yong Chong, Hai Yu, Zhiliang Zhu

Abstract

In large-scale software development, multilingual projects (those involving multiple interacting programming languages) have become increasingly common in both industry and the open-source community. Research indicates that cross-language dependencies in these projects can increase the likelihood of risks such as functionality defects and security vulnerabilities. Most existing studies focus on cross-language dependencies between host languages and specific guest languages (e.g., C/C++); interactions between host languages and a broader range of guest languages, as well as the wider impact of such dependencies on software ecosystems, remain underexplored. To address these limitations, we develop Diplomatist, a technique that identifies and analyzes cross-language dependencies between host languages, such as Java, and guest languages, including JavaScript, Python, Ruby, PHP, and C/C++. Diplomatist automatically analyzes cross-language invocation APIs and constructs a large-scale knowledge repository that standardizes code features for identifying library versions across guest languages, so that host-language libraries can be traced to the guest-language libraries they invoke. In our evaluation, Diplomatist achieved an average precision of 88.9% and a recall of 91.5% on a high-quality benchmark, indicating high accuracy in detecting cross-language dependencies. Using Diplomatist, we identified 435,258 Java libraries that indirectly or transitively depend on libraries from other ecosystems. Diplomatist also provides a list of cross-language pivotal libraries that contribute to preserving the long-term health and sustainability of software ecosystems. Moreover, we conduct a case study that examines how the risks introduced by cross-language dependencies affect programming language ecosystems, based on the full picture of the cross-language dependency graph. Our findings show that fragile projects or libraries can propagate security issues across ecosystems via these dependencies, impacting 13,739 downstream projects in the Maven ecosystem. We used Diplomatist to provide remediation suggestions to the developers of the affected projects, and developers have confirmed some of the reported issues.
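The cross-ecosystem propagation described in the abstract can be pictured as reverse reachability over a dependency graph that mixes ecosystems. The following is a minimal sketch under stated assumptions, not Diplomatist's implementation: the function `affected_downstream`, the `ecosystem:name@version` naming scheme, and the toy graph are all hypothetical and chosen only for illustration.

```python
from collections import deque


def affected_downstream(dependents, vulnerable):
    """Return every package that directly or transitively depends on `vulnerable`.

    `dependents` maps a package to the packages that depend on it
    (reverse dependency edges), regardless of which ecosystem each lives in.
    """
    seen = set()
    queue = deque([vulnerable])
    while queue:
        pkg = queue.popleft()
        for parent in dependents.get(pkg, ()):
            # Skip packages already visited so that cycles terminate.
            if parent not in seen and parent != vulnerable:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical toy graph: a guest-language (npm) library is invoked by a
# Maven library, which is in turn used by other Maven projects.
DEPENDENTS = {
    "npm:guest-lib@1.0": ["maven:bridge-lib@2.1"],
    "maven:bridge-lib@2.1": ["maven:app-a@0.3", "maven:app-b@1.4"],
    "maven:app-a@0.3": ["maven:app-c@5.0"],
}

impacted = affected_downstream(DEPENDENTS, "npm:guest-lib@1.0")
maven_impacted = sorted(p for p in impacted if p.startswith("maven:"))
```

In this sketch, a vulnerability in the guest-language library reaches Maven projects that never declare it directly, which is the kind of hidden exposure the case study quantifies at ecosystem scale.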