A Process Mining-Based System For The Analysis and Prediction of Software Development Workflows
Antía Dorado, Iván Folgueira, Sofía Martín, Gonzalo Martín, Álvaro Porto, Alejandro Ramos, John Wallace

TL;DR
This paper introduces CodeSight, a process mining and machine learning system that analyzes GitHub data to predict software development deadline compliance, aiding proactive project management.
Contribution
It presents an end-to-end system combining process mining logs with LSTM models for accurate deadline prediction in software workflows.
Findings
High precision and F1 scores in deadline prediction
Effective integration of process mining with machine learning
Actionable insights for workflow efficiency
Abstract
CodeSight is an end-to-end system designed to anticipate deadline compliance in software development workflows. It captures development and deployment data directly from GitHub, transforming it into process mining logs for detailed analysis. From these logs, the system generates metrics and dashboards that provide actionable insights into PR activity patterns and workflow efficiency. Building on this structured representation, CodeSight employs an LSTM model that predicts remaining PR resolution times based on sequential activity traces and static features, enabling early identification of potential deadline breaches. In tests, the system demonstrates high precision and F1 scores in predicting deadline compliance, illustrating the value of integrating process mining with machine learning for proactive software project management.
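As a rough illustration of the pipeline described in the abstract, the sketch below groups PR events into per-PR traces (a minimal process mining log), derives prefix/remaining-time pairs of the kind a sequence model could be trained on, and labels each PR against a deadline. It is not CodeSight's implementation: the event records, field names (`pr`, `activity`, `at`), activity labels, and the 48-hour deadline are all hypothetical, and the LSTM itself is omitted.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical PR events. Field names and activity labels are illustrative
# assumptions, not the GitHub API schema or CodeSight's log format.
RAW_EVENTS = [
    {"pr": 101, "activity": "opened", "at": "2024-03-01T09:00:00"},
    {"pr": 101, "activity": "review_requested", "at": "2024-03-01T10:00:00"},
    {"pr": 101, "activity": "approved", "at": "2024-03-02T09:00:00"},
    {"pr": 101, "activity": "merged", "at": "2024-03-02T12:00:00"},
    {"pr": 102, "activity": "opened", "at": "2024-03-01T11:00:00"},
    {"pr": 102, "activity": "closed", "at": "2024-03-05T11:00:00"},
]


def build_event_log(raw_events):
    """Group events by case (PR number) and order each trace by timestamp."""
    traces = defaultdict(list)
    for event in raw_events:
        timestamp = datetime.fromisoformat(event["at"])
        traces[event["pr"]].append((timestamp, event["activity"]))
    for events in traces.values():
        events.sort()
    return dict(traces)


def remaining_time_samples(trace):
    """Return (activity prefix, remaining hours) pairs for one trace.

    Each proper prefix of the trace is paired with the time left until the
    final event, which is the regression target for remaining-time prediction.
    """
    end = trace[-1][0]
    samples = []
    for i in range(1, len(trace)):
        prefix = [activity for _, activity in trace[:i]]
        remaining_hours = (end - trace[i - 1][0]).total_seconds() / 3600
        samples.append((prefix, remaining_hours))
    return samples


def meets_deadline(trace, deadline):
    """True if the PR was resolved within `deadline` of its first event."""
    return trace[-1][0] - trace[0][0] <= deadline


log = build_event_log(RAW_EVENTS)
samples_101 = remaining_time_samples(log[101])
compliance = {pr: meets_deadline(t, timedelta(hours=48)) for pr, t in log.items()}
```

In a setup like the one the abstract describes, the activity prefixes would be encoded and fed to the sequence model alongside static PR features, and the predicted remaining time would be compared against the deadline to flag likely breaches early.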
