BioHiCL: Hierarchical Multi-Label Contrastive Learning for Biomedical Retrieval with MeSH Labels
Mengfei Lan, Lecheng Zheng, Halil Kilicoglu

TL;DR
BioHiCL introduces hierarchical multi-label contrastive learning using MeSH annotations to improve biomedical text retrieval and related tasks.
Contribution
BioHiCL uses hierarchical MeSH annotations as structured supervision for multi-label contrastive learning, with the aim of improving biomedical retrieval.
Findings
BioHiCL-Base (0.1B) and BioHiCL-Large (0.3B) achieve promising performance on biomedical retrieval tasks.
The same models also perform well on sentence similarity and question answering.
Both models remain computationally efficient for deployment.
Abstract
Effective biomedical information retrieval requires modeling domain semantics and hierarchical relationships among biomedical texts. Existing biomedical generative retrievers build on coarse binary relevance signals, limiting their ability to capture semantic overlap. We propose BioHiCL (Biomedical Retrieval with Hierarchical Multi-Label Contrastive Learning), which leverages hierarchical MeSH annotations to provide structured supervision for multi-label contrastive learning. Our models, BioHiCL-Base (0.1B) and BioHiCL-Large (0.3B), achieve promising performance on biomedical retrieval, sentence similarity, and question answering tasks, while remaining computationally efficient for deployment.
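To make the idea of hierarchy-aware multi-label supervision concrete, the sketch below shows one way graded label overlap could replace a binary relevance signal in a contrastive loss. It is an illustrative assumption, not the loss defined in the paper: the ancestor expansion of dotted tree numbers, the Jaccard overlap, the soft-target weighting, the temperature value, and all function names are hypothetical choices, and the tree-number strings are made-up placeholders rather than real MeSH codes.

```python
import math

def expand_with_ancestors(tree_numbers):
    """Return each dotted tree number together with all of its prefixes."""
    expanded = set()
    for tn in tree_numbers:
        parts = tn.split(".")
        for depth in range(1, len(parts) + 1):
            expanded.add(".".join(parts[:depth]))
    return expanded

def hierarchical_overlap(labels_a, labels_b):
    """Jaccard overlap of ancestor-expanded label sets, in [0, 1]."""
    a = expand_with_ancestors(labels_a)
    b = expand_with_ancestors(labels_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cosine(u, v):
    """Cosine similarity of two nonzero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm

def soft_contrastive_loss(embeddings, labels, temperature=0.1):
    """Overlap-weighted negative log-softmax, averaged over usable anchors.

    Assumed formulation: each anchor's target distribution over the other
    items in the batch is proportional to their hierarchical label overlap.
    """
    n = len(embeddings)
    total, counted = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        weights = [hierarchical_overlap(labels[i], labels[j]) for j in others]
        weight_sum = sum(weights)
        if weight_sum == 0.0:
            continue  # anchor shares no labels with the rest of the batch
        logits = [cosine(embeddings[i], embeddings[j]) / temperature
                  for j in others]
        # Numerically stable log-sum-exp for the softmax denominator.
        max_logit = max(logits)
        log_denom = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits))
        total += -sum((w / weight_sum) * (l - log_denom)
                      for w, l in zip(weights, logits))
        counted += 1
    return total / counted if counted else 0.0

# Placeholder labels: the first two items share two of three hierarchy levels.
labels = [["A01.100.200"], ["A01.100.300"], ["B02.500"]]
aligned = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]      # similar labels, close vectors
misaligned = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]   # similar labels, distant vectors
print(soft_contrastive_loss(aligned, labels))
print(soft_contrastive_loss(misaligned, labels))
```

Under these assumptions, two items that share a parent in the hierarchy receive partial credit instead of being treated as unrelated, so the loss is lower when their embeddings are close. A real training setup would compute the same quantity on batched tensors in a deep learning framework.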
