Negation is Not Semantic: Diagnosing Dense Retrieval Failure Modes for Trade-offs in Contradiction-Aware Biomedical QA
Soumya Ranjan Sahoo, Gagan N., Sanand Sasidharan, Divya Bharti

TL;DR
This paper diagnoses dense retrieval failure modes in biomedical question answering, introduces an architecture that balances support retrieval with contradiction detection, and improves answer reliability and citation coverage in LLM-based systems.
Contribution
The paper identifies failure modes including Semantic Collapse and Retrieval Asymmetry, and proposes a Decoupled Lexical Architecture with improved retrieval and answer generation methods.
Findings
Decoupled Lexical Architecture balances support recall and contradiction detection.
The architecture achieved the highest Weighted MRR (0.790) on the proxy benchmark.
The system ranked 2nd on contradiction F1 and 3rd on citation coverage in the TREC results.
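For readers unfamiliar with the metric, the following is a minimal sketch of how a mean reciprocal rank (MRR) score can be computed, along with one possible way to combine a support score and a contradiction score into a single weighted figure. The summary does not define the paper's Weighted MRR, so the `weighted_mrr` helper and its `w_support` parameter are hypothetical illustrations, not the authors' formula.

```python
from typing import Sequence, Set


def reciprocal_rank(ranked_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """Return 1/rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    runs: Sequence[Sequence[str]], qrels: Sequence[Set[str]]
) -> float:
    """Average the reciprocal rank over all queries (0.0 for an empty run set)."""
    if not runs:
        return 0.0
    total = sum(reciprocal_rank(run, rel) for run, rel in zip(runs, qrels))
    return total / len(runs)


def weighted_mrr(
    support_mrr: float, contradiction_mrr: float, w_support: float = 0.5
) -> float:
    """Hypothetical combination: a convex mix of the two per-task MRR scores.

    The weighting scheme is an assumption made for illustration only.
    """
    return w_support * support_mrr + (1.0 - w_support) * contradiction_mrr


# Toy example with made-up document ids: three queries, one with no hit.
example_runs = [["d3", "d1", "d2"], ["d5", "d4"], ["d9"]]
example_qrels = [{"d1"}, {"d5"}, {"d0"}]
example_mrr = mean_reciprocal_rank(example_runs, example_qrels)
```

In this toy run the first relevant documents sit at ranks 2 and 1, and the third query retrieves nothing relevant, so the per-query scores are 0.5, 1.0, and 0.0. A score near zero, such as the 0.023 reported in the abstract for contradiction detection, means the first relevant document typically appears very deep in the ranking or not at all.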
Abstract
Large Language Models (LLMs) have demonstrated strong capabilities in biomedical question answering, yet their tendency to generate plausible but unverified claims poses serious risks in clinical settings. To mitigate these risks, the TREC 2025 BioGen track mandates grounded answers that explicitly surface contradictory evidence (Task A) and the generation of narrative-driven, fully attributed responses (Task B). Addressing the absence of target ground truth, we present a proxy-based development framework using the SciFact dataset to systematically optimize retrieval architectures. Our iterative evaluation revealed a "Simplicity Paradox": complex adversarial dense retrieval strategies failed catastrophically at contradiction detection (MRR 0.023) due to Semantic Collapse, where negation signals become indistinguishable in vector space. We further identify a Retrieval Asymmetry:…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling · Biomedical Text Mining and Ontologies
