Validation of Whole-Slide Foundation Models for Image Retrieval in TCGA Data

Tianhao Lei; Parsa Esmaeilkhani; Saghir Alfasly; Wataru Uegami; Judy C. Boughey; Matthew P. Goetz; Krishna R. Kalari; H.R. Tizhoosh

arXiv:2605.00902 · cs.CV · May 5, 2026

TL;DR

This study benchmarks ten whole-slide image retrieval pipelines on TCGA data. It finds that slide foundation models hold only a modest advantage over patch-based and supervised aggregation baselines, and it highlights fundamental challenges in morphology-based retrieval.

Contribution

The paper compares slide foundation models, a supervised ABMIL aggregator, and patch-based methods for histopathology image retrieval, and emphasizes the need for multimodal and diagnosis-aware strategies.

Findings

01 Foundation models showed only modest improvements over patch-based methods.

02 Performance varied more across organs and diagnoses than across architectures.

03 Even the best models achieved only about 68% retrieval accuracy, with some subtypes at 0%.

Abstract

Foundation models are reshaping computational histopathology, yet their value for whole-slide image retrieval relative to strong patch-based and supervised aggregation baselines remains unclear. We benchmarked ten pipelines on 9,387 diagnostic slides spanning 17 organs and 60 diagnoses from The Cancer Genome Atlas (TCGA) using patient-level leave-one-patient-out evaluation. Methods included four pre-trained slide foundation models, a supervised attention-based multiple instance learning (ABMIL) aggregator on patch embeddings, and patch-level retrieval across five sampling densities. Performance varied more across organs and diagnoses than across architectures. Although the slide foundation model TITAN achieved the strongest overall results, its advantage was modest; ABMIL and patch-based methods reached comparable Top-1 and Top-3 accuracy, with no model consistently dominant.…
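
The evaluation protocol named in the abstract, leave-one-patient-out retrieval scored by Top-1 and Top-3 accuracy, can be illustrated with a minimal sketch. This is not the authors' code. The function name `lopo_topk_accuracy`, the use of cosine similarity, and the hit criterion (a query counts as correct if any of its k nearest slides shares its diagnosis) are all assumptions made for illustration; the abstract does not specify the similarity measure or how Top-k is scored.

```python
import numpy as np


def lopo_topk_accuracy(embeddings, labels, patient_ids, k=1):
    """Top-k retrieval accuracy under leave-one-patient-out evaluation.

    Hypothetical helper, not from the paper. Assumes non-zero embeddings
    and at least k slides from other patients for every query.
    """
    X = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    patient_ids = np.asarray(patient_ids)

    # L2-normalise so the dot product equals cosine similarity (assumed metric).
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    sim = X @ X.T

    # Leave-one-patient-out: a query may not retrieve any slide from its
    # own patient, which also excludes the query slide itself.
    same_patient = patient_ids[:, None] == patient_ids[None, :]
    sim[same_patient] = -np.inf

    # Indices of the k most similar remaining slides for each query.
    topk = np.argsort(-sim, axis=1)[:, :k]

    # Assumed hit criterion: any of the k retrieved slides shares the
    # query's diagnosis. A majority-vote rule would be an alternative.
    hits = (labels[topk] == labels[:, None]).any(axis=1)
    return float(hits.mean())
```

Masking by patient rather than by slide matters because multiple slides from one patient tend to look alike; letting them match each other would inflate accuracy. Calling the function with `k=1` and `k=3` on the same slide embeddings gives the two metrics the abstract reports, under the assumptions stated above.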
