AutoBool: An Reinforcement-Learning trained LLM for Effective Automated Boolean Query Generation for Systematic Reviews
Shuai Wang, Harrisen Scells, Bevan Koopman, Guido Zuccon

TL;DR
AutoBool employs reinforcement learning to train large language models for generating Boolean queries that balance high recall and precision in systematic reviews, outperforming prompt-based methods and matching larger models.
Contribution
This work introduces AutoBool, a reinforcement learning framework that optimizes Boolean query generation without supervised target queries, supported by the largest dataset of its kind.
Findings
AutoBool outperforms zero-shot and few-shot prompting methods.
It matches or exceeds larger GPT models in effectiveness.
It retrieves significantly fewer documents while maintaining high recall.
Abstract
We present AutoBool, a reinforcement learning (RL) framework that trains large language models (LLMs) to generate effective Boolean queries for medical systematic reviews. Boolean queries are the primary mechanism for literature retrieval in this domain and must achieve high recall while maintaining reasonable precision, a challenging balance that existing prompt-based LLM approaches often struggle to achieve. A major limitation in this space is the lack of high-quality ground-truth Boolean queries for each topic, which makes supervised fine-tuning impractical. AutoBool addresses this challenge by using RL to directly optimize query generation with retrieval measures, without requiring target queries. To support this effort, we create and release the largest dataset of its kind: 65,588 topics in total for training and evaluating the task of automatic Boolean query formulation.…
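To make "optimizing with retrieval measures" concrete, the sketch below scores a generated query only by the documents it retrieves, with no target query needed. It is an illustration under stated assumptions, not the reward used in the paper: the function name `retrieval_reward`, the F-beta form, and the default `beta=3.0` (which weights recall above precision) are all hypothetical choices.

```python
def retrieval_reward(retrieved, relevant, beta=3.0):
    """Hypothetical reward: F-beta between retrieved and relevant document IDs.

    Assumption: beta > 1 favours recall, reflecting the recall-heavy needs of
    systematic reviews. The paper's actual reward may differ.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    if not retrieved or not relevant:
        return 0.0  # an empty result set (or an unjudged topic) earns nothing

    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0

    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Example: a query that finds both relevant studies among four retrieved.
reward = retrieval_reward(retrieved=[1, 2, 3, 4], relevant=[1, 2])
```

In an RL loop, a scalar like this would be computed after executing each sampled query against the search engine and then fed back as the training signal; a query that retrieves far more documents for the same recall receives a lower score.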
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Topic Modeling · Artificial Intelligence in Healthcare and Education
