CliniQ: A Multi-faceted Benchmark for Electronic Health Record Retrieval with Semantic Match Assessment
Zhengyun Zhao, Hongyi Yuan, Jingjing Liu, Haichao Chen, Huaiyuan Ying, Songchi Zhou, Yue Zhong, Sheng Yu

TL;DR
CliniQ is a new public benchmark for EHR retrieval that evaluates different retrieval methods, including semantic matching, across single and multi-patient scenarios, aiming to advance clinical information retrieval research.
Contribution
We introduce CliniQ, the first comprehensive EHR retrieval benchmark with semantic match assessment, covering diverse retrieval settings and providing detailed relevance judgments and analysis.
Findings
BM25 performs strongly as a baseline.
Dense retrievers sometimes outperform domain-specific models.
Semantic matching reveals strengths and weaknesses of retrieval methods.
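To make the BM25 baseline and the semantic gap concrete, here is a minimal sketch of Okapi BM25 scoring over the chunks of a single note, in the spirit of the Single-Patient Retrieval setting. It is an illustration only, not the benchmark's implementation: the whitespace tokenizer, the parameter values `k1=1.5` and `b=0.75`, the function names, and the example chunks are all assumptions made for this sketch.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, chosen only for brevity.
    return text.lower().split()


def bm25_scores(query, chunks, k1=1.5, b=0.75):
    """Score each chunk against the query with Okapi BM25."""
    docs = [tokenize(c) for c in chunks]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of chunks that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical note chunks, invented for illustration (not taken from MIMIC-III).
chunks = [
    "patient admitted with chest pain and shortness of breath",
    "discharged on metoprolol and aspirin",
    "history of myocardial infarction in 2010",
]

# A query that appears verbatim in one chunk is matched by that chunk only.
string_match = bm25_scores("aspirin", chunks)
best = max(range(len(chunks)), key=string_match.__getitem__)

# A lay synonym for "myocardial infarction" shares no tokens with any chunk,
# so a purely lexical scorer cannot retrieve the relevant chunk.
semantic_match = bm25_scores("heart attack", chunks)
```

In this toy setup the exact-term query ranks the medication chunk first, while the synonym query scores zero everywhere. That kind of lexical miss is what a semantic match assessment is meant to surface.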
Abstract
Electronic Health Record (EHR) retrieval plays a pivotal role in various clinical tasks, but its development has been severely impeded by the lack of publicly available benchmarks. In this paper, we introduce a novel public EHR retrieval benchmark, CliniQ, to address this gap. We consider two retrieval settings: Single-Patient Retrieval and Multi-Patient Retrieval, reflecting various real-world scenarios. Single-Patient Retrieval focuses on finding relevant parts within a patient note, while Multi-Patient Retrieval involves retrieving EHRs from multiple patients. We build our benchmark upon 1,000 discharge summary notes along with the ICD codes and prescription labels from MIMIC-III, and collect 1,246 unique queries with 77,206 relevance judgments by further leveraging powerful LLMs as annotators. Additionally, we include a novel assessment of the semantic gap issue in EHR retrieval by…
Taxonomy
Topics: Data Quality and Management · Topic Modeling · Semantic Web and Ontologies
