Can Machine Learning Support the Selection of Studies for Systematic Literature Review Updates?
Marcelo Costalonga, Bianca Minetto Napoleão, Maria Teresa Baldassarre, Katia Romero Felizardo, Igor Steinmacher, Marcos Kalinowski

TL;DR
This study evaluates machine learning models for supporting study selection in systematic literature review updates in Software Engineering, finding they can reduce effort but are not yet reliable enough to replace human judgment.
Contribution
It demonstrates that ML models can assist in reducing effort in SLR updates, but still require human expertise for reliable study selection.
Findings
ML models achieved a modest F-score of 0.33.
Models can reduce study selection effort by 33.9%.
Human reviewers' initial screening aligns closely with final results.
Abstract
[Background] Systematic literature reviews (SLRs) are essential for synthesizing evidence in Software Engineering (SE), but keeping them up-to-date requires substantial effort. Study selection, one of the most labor-intensive steps, involves reviewing numerous studies and requires multiple reviewers to minimize bias and avoid loss of evidence. [Objective] This study aims to evaluate if Machine Learning (ML) text classification models can support reviewers in the study selection for SLR updates. [Method] We reproduce the study selection of an SLR update performed by three SE researchers. We trained two supervised ML models (Random Forest and Support Vector Machines) with different configurations using data from the original SLR. We calculated the study selection effectiveness of the ML models for the SLR update in terms of precision, recall, and F-measure. We also compared the…
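The abstract reports effectiveness in terms of precision, recall, and F-measure. As a rough illustration of how these three metrics relate for an include/exclude decision, here is a minimal sketch. The function name `selection_metrics` and the toy labels are hypothetical and are not taken from the paper; the labels do not reproduce the study's data or its reported scores.

```python
def selection_metrics(actual, predicted):
    """Return (precision, recall, f_measure) for include/exclude decisions.

    `actual` holds the reviewers' final decisions and `predicted` the
    model's decisions; True means "include the study".
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)        # correctly included
    fp = sum(1 for a, p in pairs if not a and p)    # wrongly included
    fn = sum(1 for a, p in pairs if a and not p)    # wrongly excluded

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


# Hypothetical toy labels for eight candidate studies (illustration only).
actual = [True, True, False, False, False, True, False, False]
predicted = [True, False, True, False, False, True, True, False]

precision, recall, f_measure = selection_metrics(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_measure:.2f}")
```

In study selection, recall usually matters most, because a wrongly excluded study is lost evidence, whereas a wrongly included one only costs extra reading time.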
Taxonomy
Topics: Meta-analysis and systematic reviews
