Multi-Programming-Language Commits in OSS: An Empirical Study on Apache Projects
Zengyang Li, Xiaoxiao Qi, Qinyi Yu, Peng Liang, Ran Mo, Chen Yang

TL;DR
This empirical study analyzes multi-programming-language commits in Apache projects, revealing their prevalence, complexity, and impact on issue resolution time and bug proneness, offering insights for software quality management.
Contribution
To the authors' knowledge, this is the first comprehensive empirical analysis of MPLCs in open-source projects; it characterizes MPLCs and their effects on development and software quality.
Findings
9% of commits are MPLCs, and the proportion of MPLCs is stable in 80% of projects
Over 90% of MPLCs involve two programming languages
MPLCs show higher change complexity and bug proneness than non-MPLCs
Abstract
Modern software systems, such as Spark, are usually written in multiple programming languages (PLs). Besides benefiting from code reuse, such systems can also take advantage of specific PLs to implement certain features, to meet various quality needs, and to improve development efficiency. In this context, a change to such systems may need to modify source files written in different PLs. We define a multi-programming-language commit (MPLC) in a version control system (e.g., Git) as a commit that involves modified source files written in two or more PLs. To our knowledge, the phenomenon of MPLCs in software development has not been explored yet. In light of the potential impact of MPLCs on development difficulty and software quality, we performed an empirical study to understand the state of MPLCs, their change complexity, as well as their impact on open time of issues and bug proneness…
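The MPLC definition in the abstract can be illustrated with a minimal sketch. It assumes a commit is represented by the list of file paths it modifies and that a file's PL can be inferred from its extension. The extension map and the names languages_in_commit and is_mplc are hypothetical, chosen for illustration; this summary does not describe how the study itself classified files.

```python
from pathlib import PurePosixPath

# Hypothetical extension-to-language map (an assumption for illustration);
# the study's actual list of PLs and classification rules are not given here.
EXTENSION_TO_LANGUAGE = {
    ".java": "Java",
    ".scala": "Scala",
    ".py": "Python",
    ".c": "C",
    ".h": "C",
    ".cpp": "C++",
    ".js": "JavaScript",
}


def languages_in_commit(modified_paths):
    """Return the set of PLs found among a commit's modified source files."""
    languages = set()
    for path in modified_paths:
        suffix = PurePosixPath(path).suffix.lower()
        language = EXTENSION_TO_LANGUAGE.get(suffix)
        if language is not None:  # files with unmapped extensions are ignored
            languages.add(language)
    return languages


def is_mplc(modified_paths):
    """A commit counts as an MPLC if it modifies source files in two or more PLs."""
    return len(languages_in_commit(modified_paths)) >= 2
```

For example, a commit touching core/Foo.scala and python/foo.py would count as an MPLC under this sketch, while one touching only a.java, b.java, and README.md would not. In practice the list of modified paths could come from a command such as git show --name-only.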
Taxonomy
Topics: Software Engineering Research · Software Engineering Techniques and Practices · Software System Performance and Reliability
