Bellwethers: A Baseline Method For Transfer Learning
Rahul Krishna, Tim Menzies

TL;DR
This paper introduces 'bellwethers', a simple transfer learning baseline in software analytics that identifies a single project whose data provides stable and effective predictions across multiple projects, reducing conclusion instability.
Contribution
The paper proposes a straightforward method to identify a bellwether project for transfer learning, demonstrating its effectiveness across various software engineering tasks and establishing it as a baseline for future research.
Findings
Bellwethers provide stable predictions across projects.
Bellwether-based transfer learning performs comparably to more complex transfer learners such as TCA+, transfer Naive Bayes, and value cognitive boosting.
The approach is simple and applicable to multiple SE tasks.
Abstract
Software analytics builds quality prediction models for software projects. Experience shows that (a) the more projects studied, the more varied are the conclusions; and (b) project managers lose faith in the results of software analytics if those results keep changing. To reduce this conclusion instability, we propose the use of "bellwethers": given N projects from a community, the bellwether is the project whose data yields the best predictions on all others. Bellwethers offer a way to mitigate conclusion instability because conclusions about a community are stable as long as this bellwether continues as the best oracle. Bellwethers are also simple to discover (just wrap a for-loop around standard data miners). When compared to other transfer learning methods (TCA+, transfer Naive Bayes, value cognitive boosting), using just the bellwether data to construct a simple transfer learner…
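The "for-loop around standard data miners" idea can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the nearest-centroid learner, the accuracy score, the use of the median to aggregate scores, and the toy project data are all stand-ins chosen so the example runs with the Python standard library alone.

```python
from statistics import mean, median

def train_centroids(rows):
    """Toy stand-in for a 'standard data miner': one mean feature vector per label."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*vecs)]
            for label, vecs in by_label.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy(centroids, rows):
    """Fraction of rows labelled correctly (an assumed metric, for illustration)."""
    return sum(predict(centroids, f) == y for f, y in rows) / len(rows)

def find_bellwether(projects):
    """Train on each project in turn, test on all the others, keep the best source."""
    scores = {}
    for source, source_rows in projects.items():
        model = train_centroids(source_rows)
        # Assumption: summarise cross-project performance by the median score.
        scores[source] = median(
            accuracy(model, rows)
            for name, rows in projects.items() if name != source
        )
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical community of three projects; each row is ([metric1, metric2], defect label).
projects = {
    "alpha": [([1, 1], 0), ([2, 1], 0), ([8, 9], 1), ([9, 8], 1)],
    "beta":  [([1, 2], 0), ([2, 2], 0), ([9, 9], 1), ([3, 3], 1)],
    "gamma": [([1, 1], 0), ([9, 9], 0), ([2, 2], 1), ([8, 8], 1)],  # noisy labels
}

best, scores = find_bellwether(projects)
print(best, scores)
```

On this toy data the loop selects "beta", since a model trained on it scores highest on the other two projects. In practice the learner and metric would be whatever the community already uses; the loop itself does not change.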
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Software System Performance and Reliability
