Scalable High-Recall Constraint-Satisfaction-Based Information Retrieval for Clinical Trials Matching
Cyrus Zhou, Yufei Jin, Yilin Xu, Yu-Chiang Wang, Chieh-Ju Chao, Monica S. Lam

TL;DR
SatIR is a scalable, interpretable clinical trial retrieval method using formal constraint satisfaction techniques, significantly improving recall and relevance over existing approaches.
Contribution
Introduces SatIR, a novel formal methods-based approach leveraging SMT, relational algebra, and LLMs for high-recall, precise, and interpretable clinical trial matching.
Findings
Retrieves 32%-72% more relevant trials per patient.
Improves recall over existing methods by 22-38 points.
Per-patient retrieval time is under 3 seconds.
Abstract
Clinical trials are central to evidence-based medicine, yet many struggle to meet enrollment targets, even though ClinicalTrials.gov lists over half a million trials and attracts approximately two million users monthly. Existing retrieval techniques, largely based on keyword and embedding-similarity matching between patient profiles and eligibility criteria, often suffer from low recall, low precision, and limited interpretability due to complex constraints. We propose SatIR, a scalable clinical trial retrieval method based on constraint satisfaction, enabling high-precision and interpretable matching of patients to relevant trials. Our approach uses formal methods -- Satisfiability Modulo Theories (SMT) and relational algebra -- to efficiently represent and match key constraints from clinical trials and patient records. Beyond leveraging established medical…
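To make the idea of retrieval as constraint satisfaction concrete, here is a minimal sketch in plain Python. It is not the SatIR implementation and does not use an SMT solver: it substitutes simple interval-overlap checks for SMT, and every name in it (`Range`, `is_satisfiable`, `retrieve`, the trial IDs, and the attribute names and thresholds) is hypothetical. It assumes each trial's eligibility criteria can be written as numeric ranges over patient attributes, and it treats an attribute missing from the patient record as unknown rather than as a mismatch, which is one way a satisfiability check can favor recall.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Range:
    """Closed numeric interval; None means unbounded on that side."""
    lo: Optional[float] = None
    hi: Optional[float] = None

    def overlaps(self, other: "Range") -> bool:
        # Tightest bounds implied by both intervals together.
        lows = [b for b in (self.lo, other.lo) if b is not None]
        highs = [b for b in (self.hi, other.hi) if b is not None]
        lo = max(lows) if lows else None
        hi = min(highs) if highs else None
        # The conjunction is satisfiable unless the bounds cross.
        return lo is None or hi is None or lo <= hi


# An attribute absent from the patient record is unconstrained (unknown).
UNKNOWN = Range()


def is_satisfiable(patient: Dict[str, Range], trial: Dict[str, Range]) -> bool:
    """True if no trial constraint contradicts what is known about the patient."""
    return all(
        patient.get(attr, UNKNOWN).overlaps(required)
        for attr, required in trial.items()
    )


def retrieve(patient: Dict[str, Range], trials: Dict[str, Dict[str, Range]]) -> List[str]:
    """Return IDs of trials whose constraints the patient could satisfy."""
    return [tid for tid, criteria in trials.items() if is_satisfiable(patient, criteria)]


# Hypothetical trials and patient, for illustration only.
trials = {
    "TRIAL_A": {"age": Range(18, 65), "hba1c": Range(7.0, None)},
    "TRIAL_B": {"age": Range(None, 17)},
    "TRIAL_C": {"egfr": Range(60, None)},
}
patient = {"age": Range(54, 54), "hba1c": Range(8.2, 8.2)}

print(retrieve(patient, trials))  # ['TRIAL_A', 'TRIAL_C']
```

In this sketch `TRIAL_B` is excluded because the patient's age contradicts its upper bound, while `TRIAL_C` is kept because the patient's eGFR is unknown and so cannot rule the trial out. Each decision can be traced to a specific constraint, which illustrates the kind of interpretability the abstract refers to. A real system would need a far richer constraint language than independent numeric ranges.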
