TL;DR
This paper proposes an approach to test case prioritization (TCP) in Continuous Integration (CI) environments that combines an extensive feature set with machine learning (ML) to improve fault detection efficiency in large, real-world software projects.
Contribution
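It introduces a detailed data model, a comprehensive feature set, and tools for data collection across multiple open-source projects. These address two limitations of prior studies: small feature sets, and evaluation subjects with short regression testing times and few failed builds.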
Findings
Effective ML-based TCP models can be developed with comprehensive features.
Data collection time impacts the effectiveness of TCP models.
ML models' performance decays over time, affecting prioritization quality.
Abstract
Continuous Integration (CI) requires efficient regression testing to ensure software quality without significantly delaying CI builds. This calls for techniques to reduce regression testing time, such as Test Case Prioritization (TCP) techniques that prioritize the execution of test cases to detect faults as early as possible. Many recent TCP studies employ various Machine Learning (ML) techniques to deal with the dynamic and complex nature of CI. However, most of them use a limited number of features for training ML models and evaluate the models on subjects for which the application of TCP makes little practical sense, due to their small regression testing time and low number of failed builds. In this work, we first define, at a conceptual level, a data model that captures data sources and their relations in a typical CI environment. Second, based on this data model,…
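To make "detect faults as early as possible" concrete, here is a minimal sketch in Python. It is not the paper's method: the ranking uses a hypothetical per-test historical failure rate as a stand-in for an ML model's score, and the quality measure is APFD (Average Percentage of Faults Detected), a standard TCP metric that the text above does not name. Each failing test is assumed to count as one fault, and all test names and scores are made up for illustration.

```python
def apfd(order, failing):
    """APFD for one build, assuming each failing test reveals one distinct fault.

    order:   list of test ids in execution order
    failing: set of test ids that fail in this build
    """
    n = len(order)
    m = len(failing)
    if n == 0 or m == 0:
        raise ValueError("APFD is undefined without tests or without failures")
    # 1-based positions at which the failing tests are executed.
    ranks = [i for i, test in enumerate(order, start=1) if test in failing]
    if len(ranks) != m:
        raise ValueError("every failing test must appear exactly once in order")
    return 1 - sum(ranks) / (n * m) + 1 / (2 * n)


def prioritize(tests, score):
    """Order tests by descending score; ties keep their original order."""
    return sorted(tests, key=lambda t: score.get(t, 0.0), reverse=True)


# Hypothetical data: four tests and an assumed historical failure rate each.
tests = ["t1", "t2", "t3", "t4"]
failure_rate = {"t1": 0.0, "t2": 0.1, "t3": 0.6, "t4": 0.05}
failing_now = {"t3"}

baseline = apfd(tests, failing_now)
ranked = prioritize(tests, failure_rate)
improved = apfd(ranked, failing_now)
print(ranked, baseline, improved)
```

Under these assumptions the failing test moves from third to first position, so APFD rises from 0.375 to 0.875. An ML-based approach would replace `failure_rate` with scores predicted from a richer feature set.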
