Enhancing Study-Level Inference from Clinical Trial Papers via Reinforcement Learning-Based Numeric Reasoning
Massimiliano Pronesti, Michela Lorandi, Paul Flanagan, Oisin Redmond, Anya Belz, Yufang Hou

TL;DR
This paper introduces a reinforcement learning-based numeric reasoning system to improve the extraction of quantitative evidence from clinical trial papers, leading to more accurate and interpretable study-level conclusions in systematic reviews.
Contribution
It presents a novel reinforcement learning approach for numeric data extraction and reasoning, outperforming retrieval-based systems and large language models on relevant benchmarks.
Findings
Up to 21% absolute F1 score improvement over retrieval systems
Outperforms large language models by up to 9% on the RCTs benchmark
Reinforcement learning enhances numeric reasoning accuracy
Abstract
Systematic reviews in medicine play a critical role in evidence-based decision-making by aggregating findings from multiple studies. A central bottleneck in automating this process is extracting numeric evidence and determining study-level conclusions for specific outcomes and comparisons. Prior work has framed this problem as a textual inference task by retrieving relevant content fragments and inferring conclusions from them. However, such approaches often rely on shallow textual cues and fail to capture the underlying numeric reasoning behind expert assessments. In this work, we conceptualise the problem as one of quantitative reasoning. Rather than inferring conclusions from surface text, we extract structured numerical evidence (e.g., event counts or standard deviations) and apply domain-knowledge-informed logic to derive outcome-specific conclusions. We develop a numeric…
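To make the "extract numbers, then apply logic" idea concrete, here is a minimal sketch of how a study-level conclusion could be derived from extracted event counts for a binary outcome. It is an illustration under stated assumptions, not the authors' pipeline: the `ArmCounts` type, the `risk_ratio_conclusion` function, the three label strings, and the choice of a risk ratio with a 95% Wald confidence interval on the log scale are all hypothetical choices made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class ArmCounts:
    """Extracted numeric evidence for one trial arm (hypothetical structure)."""
    events: int  # participants who experienced the outcome
    total: int   # participants analysed in this arm


def risk_ratio_conclusion(intervention: ArmCounts, comparator: ArmCounts, z: float = 1.96):
    """Return (risk ratio, (ci_low, ci_high), label) for a binary outcome.

    Assumption: both arms have at least one event and events < total,
    so no continuity correction is applied in this sketch.
    """
    if min(intervention.events, comparator.events) <= 0:
        raise ValueError("this sketch requires non-zero events in both arms")

    risk_int = intervention.events / intervention.total
    risk_cmp = comparator.events / comparator.total
    rr = risk_int / risk_cmp

    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1 / intervention.events - 1 / intervention.total
        + 1 / comparator.events - 1 / comparator.total
    )
    ci_low = math.exp(math.log(rr) - z * se_log_rr)
    ci_high = math.exp(math.log(rr) + z * se_log_rr)

    # Rule: the effect is called significant only if the interval excludes 1.
    if ci_low > 1:
        label = "significantly increased"
    elif ci_high < 1:
        label = "significantly decreased"
    else:
        label = "no significant difference"
    return rr, (ci_low, ci_high), label


if __name__ == "__main__":
    # Illustrative counts only; they do not come from any trial.
    rr, ci, label = risk_ratio_conclusion(ArmCounts(10, 100), ArmCounts(30, 100))
    print(f"RR={rr:.2f}, 95% CI=({ci[0]:.2f}, {ci[1]:.2f}) -> {label}")
```

The point of the sketch is that the conclusion follows from the numbers through an explicit, inspectable rule rather than from surface wording in the paper. Continuous outcomes (means and standard deviations) would need a different effect measure, which is omitted here.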
Taxonomy
Topics: Statistical Methods in Clinical Trials
