Reassessing Large Language Model Boolean Query Generation for Systematic Reviews

Shuai Wang; Harrisen Scells; Bevan Koopman; Guido Zuccon

arXiv:2505.07155 · cs.IR · June 3, 2025

Open Access

TL;DR

This paper systematically reproduces and extends prior studies on using Large Language Models to generate Boolean queries for systematic reviews, emphasizing the importance of prompt design and model choice for effective literature retrieval.

Contribution

It addresses overlooked factors in previous work, evaluates multiple LLMs with optimized prompts, and clarifies the impact of model and prompt choices on query quality.

Findings

01. Query effectiveness varies across models and prompts

02. Guided query formulation benefits from well-chosen seed studies

03. Prompt design and model selection are crucial for success

Abstract

Systematic reviews are comprehensive literature reviews that address highly focused research questions and represent the highest form of evidence in medicine. A critical step in this process is the development of complex Boolean queries to retrieve relevant literature. Given the difficulty of manually constructing these queries, recent efforts have explored Large Language Models (LLMs) to assist in their formulation. One of the first studies, by Wang et al., investigated ChatGPT for this task; it was followed by Staudinger et al., who evaluated multiple LLMs in a reproducibility study. However, the latter overlooked several key aspects of the original work, including (i) validation of generated queries, (ii) output formatting constraints, and (iii) selection of examples for chain-of-thought (Guided) prompting. As a result, its findings diverged significantly from the original study. In this…

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Meta-analysis and systematic reviews · Topic Modeling