Predicting Coding Effort in Projects Containing XML Code
Siim Karus, Marlon Dumas

TL;DR
This study uses machine learning to predict one-year coding effort in projects containing XML code, finding that developer expertise is a key factor and that models trained on one project are unreliable on others.
Contribution
It introduces a predictive approach focusing on XML projects and highlights the limited transferability of models across different projects.
Findings
Developer expertise strongly influences coding effort.
Source code metrics have minimal impact on prediction accuracy.
Models trained on one project do not generalize well to others.
Abstract
This paper studies the problem of predicting the coding effort for a subsequent year of development by analysing metrics extracted from project repositories, with an emphasis on projects containing XML code. The study considers thirteen open source projects and applies machine learning algorithms to generate models that predict one-year coding effort, measured in terms of lines of code added, modified and deleted. Both organisational and code metrics associated with revisions are taken into account. The results show that coding effort is largely determined by the expertise of developers, while source code metrics do little to improve the accuracy of coding effort estimates. The study also shows that models trained on one project are unreliable at estimating effort in other projects.
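To make the effort measure concrete, here is a minimal sketch of how one-year coding effort could be computed from per-revision line counts. It is an illustration under stated assumptions, not the authors' implementation: the `Revision` record, its field names, the sample data, and the choice to sum added, modified and deleted lines into one number are all hypothetical, and the paper may combine or window these counts differently.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Revision:
    # Hypothetical record of one repository revision (field names are assumptions).
    day: date
    author: str
    added: int
    modified: int
    deleted: int


def coding_effort(revisions, start, days=365):
    """Sum lines added, modified and deleted in the window [start, start + days)."""
    end = start + timedelta(days=days)
    return sum(
        r.added + r.modified + r.deleted
        for r in revisions
        if start <= r.day < end
    )


# Made-up sample data, for illustration only.
revisions = [
    Revision(date(2009, 1, 10), "alice", 120, 30, 5),
    Revision(date(2009, 6, 2), "bob", 40, 10, 0),
    Revision(date(2010, 3, 15), "alice", 200, 0, 50),
]

effort_2009 = coding_effort(revisions, date(2009, 1, 1))
print(effort_2009)
```

A value like `effort_2009` would serve as the prediction target, with organisational and code metrics from the preceding revisions as model inputs.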
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Manufacturing Process and Optimization
