Generative vector search to improve pathology foundation models across multimodal vision-language tasks

Markus Ekvall, Ludvig Bergenstråhle, Patrick Truong, Ben Murrell, and Joakim Lundeberg

arXiv:2512.19360·cs.IR·December 23, 2025

Open Access

TL;DR

This paper introduces STHLM, a generative vector search method that improves retrieval in complex, high-dimensional biomedical data by sampling query-conditioned embeddings, significantly enhancing performance across various multimodal tasks.

Contribution

STHLM is a novel generative vector search technique that enables wider and more effective retrieval by iterative sampling, outperforming classical methods in biomedical multimodal applications.

Findings

01. Boosts retrieval accuracy by 10-30% across benchmarks

02. Enables up to 10-fold compression of embedding dimensions

03. Improves retrieval in scientific literature, clinical notes, and tissue images

Abstract

Retrieval-augmented generation improves large language models by grounding outputs in external knowledge sources, reducing hallucinations and addressing knowledge cutoffs. However, standard embedding-based retrieval fails to capture the complexity of multi-concept queries, particularly in domains like biomedicine, where biological data are inherently high-dimensional. For example, omics datasets and clinical reports simultaneously exhibit numerous molecular, cellular, and physiological features. We present Stochastic Latent Matching (STHLM), a generative vector search method that samples query-conditioned embeddings from text or image inputs to enhance retrieval performance. Analogous to how Chain-of-Thought reasoning enables language models to "think longer" on complex problems, STHLM allows retrieval systems to "search wider" through iterative sampling. STHLM demonstrates critical…
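The "search wider" idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how STHLM generates embeddings, so Gaussian perturbation of the query is used here purely as a stand-in for a learned query-conditioned sampler. The function names (`sample_query_embeddings`, `search_wider`), the parameters (`n_samples`, `noise_scale`), and the max-over-samples aggregation are all assumptions made for illustration.

```python
import numpy as np


def normalize(x):
    """Scale vectors to unit length along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def sample_query_embeddings(query, n_samples, noise_scale, rng):
    """Draw several embeddings conditioned on one query.

    ASSUMPTION: Gaussian noise around the query is a hypothetical
    stand-in for a learned generative model; it is not STHLM's sampler.
    """
    noise = rng.normal(scale=noise_scale, size=(n_samples, query.shape[0]))
    return normalize(query[None, :] + noise)


def search_wider(query, corpus, n_samples=8, k=3, noise_scale=0.1, seed=0):
    """Retrieve the top-k documents using several sampled query embeddings.

    Each document keeps its best cosine similarity over all samples
    (an assumed aggregation rule), then documents are ranked by it.
    """
    rng = np.random.default_rng(seed)
    samples = sample_query_embeddings(normalize(query), n_samples, noise_scale, rng)
    scores = samples @ normalize(corpus).T   # shape: (n_samples, n_docs)
    best = scores.max(axis=0)                # best score per document
    top = np.argsort(-best)[:k]              # indices of the k highest scores
    return top.tolist(), best[top].tolist()
```

With `noise_scale=0.0` every sample equals the normalized query, so the sketch reduces to classical single-vector cosine retrieval; larger values spread the samples over a wider region of the embedding space.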

Taxonomy

Topics: Multimodal Machine Learning Applications · Biomedical Text Mining and Ontologies · Topic Modeling