The Promise and Challenges of Using LLMs to Accelerate the Screening Process of Systematic Reviews
Aleksi Huotala, Miikka Kuutila, Paul Ralph, Mika Mäntylä

TL;DR
This paper investigates the potential of Large Language Models to accelerate systematic review screening by automating tasks and simplifying abstracts, finding that LLMs perform comparably to humans in some scenarios but require further research.
Contribution
The study evaluates LLMs for automating and simplifying systematic review screening, comparing different prompting techniques and assessing their effectiveness against human screeners.
Findings
LLMs such as GPT-4 perform comparably to human screeners in some screening scenarios.
Few-shot and One-shot prompts outperform Zero-shot prompting.
Text simplification does not significantly improve human screening performance.
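The prompting techniques compared above (Zero-shot, One-shot, Few-shot, and Few-shot with Chain-of-Thought) differ mainly in how many labelled examples, and how much reasoning, are placed before the paper to be screened. The sketch below illustrates that difference only. It is not the prompt text used in the paper: the `Example` class, the `build_prompt` function, and all wording inside the prompt are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Example:
    """A labelled screening example (hypothetical structure, not from the paper)."""
    title: str
    abstract: str
    decision: str                    # "include" or "exclude"
    reasoning: Optional[str] = None  # only shown in chain-of-thought prompts


def format_paper(title: str, abstract: str) -> str:
    """Render one paper as a title/abstract block."""
    return f"Title: {title}\nAbstract: {abstract}"


def build_prompt(criteria: str, title: str, abstract: str,
                 examples: Sequence[Example] = (),
                 chain_of_thought: bool = False) -> str:
    """Assemble a screening prompt.

    No examples      -> zero-shot
    One example      -> one-shot
    Several examples -> few-shot
    Several examples with reasoning shown -> few-shot with chain-of-thought
    """
    parts = [
        "You are screening papers for a systematic review.",  # assumed wording
        f"Inclusion criteria: {criteria}",
    ]
    for ex in examples:
        parts.append(format_paper(ex.title, ex.abstract))
        if chain_of_thought and ex.reasoning:
            parts.append(f"Reasoning: {ex.reasoning}")
        parts.append(f"Decision: {ex.decision}")
    # The paper to be screened goes last; the model completes the final cue.
    parts.append(format_paper(title, abstract))
    parts.append("Reasoning:" if chain_of_thought else "Decision:")
    return "\n\n".join(parts)
```

Calling `build_prompt(criteria, title, abstract)` with no examples yields a zero-shot prompt; passing one or several `Example` objects yields one-shot or few-shot prompts, and setting `chain_of_thought=True` additionally shows each example's reasoning and asks the model to reason before deciding.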
Abstract
Systematic review (SR) is a popular research method in software engineering (SE). However, conducting an SR takes an average of 67 weeks. Thus, automating any step of the SR process could reduce the effort associated with SRs. Our objective is to investigate if Large Language Models (LLMs) can accelerate title-abstract screening by simplifying abstracts for human screeners, and automating title-abstract screening. We performed an experiment where humans screened titles and abstracts for 20 papers with both original and simplified abstracts from a prior SR. The experiment with human screeners was reproduced with GPT-3.5 and GPT-4 LLMs to perform the same screening tasks. We also studied if different prompting techniques (Zero-shot (ZS), One-shot (OS), Few-shot (FS), and Few-shot with Chain-of-Thought (FS-CoT)) improve the screening performance of LLMs. Lastly, we studied if redesigning…
Taxonomy
Topics: Meta-analysis and systematic reviews · Computational and Text Analysis Methods · Explainable Artificial Intelligence (XAI)
Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Absolute Position Encodings · Dense Connections · Label Smoothing · Residual Connection
