Aggregation Schemes for Single-Vector WSI Representation Learning in Digital Pathology
Sobhan Hemati, Ghazal Alabtah, Saghir Alfasly, H.R. Tizhoosh

TL;DR
This paper evaluates various set-based aggregation techniques for converting multiple patch embeddings into a single WSI representation to improve image retrieval in digital pathology, benchmarking their performance across different cancer types.
Contribution
It provides a comprehensive comparison of recent aggregation methods for WSI representation learning, highlighting their effectiveness in pathology image retrieval tasks.
Findings
Deep Sets and attention-based methods outperform simple pooling.
Fisher Vector approaches show competitive retrieval accuracy.
Non-aggregating median of minimum distances remains a strong baseline.
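To make the non-aggregating baseline concrete, here is a minimal sketch of a median-of-minimum-distances comparison between two slides that keeps every patch embedding. The Euclidean metric, the query-to-candidate direction, the function names, and the toy 2-dimensional embeddings are all assumptions for illustration; the summary above does not specify them.

```python
import math
from statistics import median
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embeddings (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def median_of_min_distances(
    query_patches: Sequence[Sequence[float]],
    candidate_patches: Sequence[Sequence[float]],
) -> float:
    """For each query patch, take the distance to its nearest candidate
    patch, then return the median of those nearest distances."""
    nearest = [
        min(euclidean(q, c) for c in candidate_patches)
        for q in query_patches
    ]
    return median(nearest)


# Hypothetical patch embeddings: one query slide and two archive slides.
query = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]]
slide_a = [[0.0, 0.0], [1.0, 1.0]]
slide_b = [[5.0, 5.0], [6.0, 5.0]]

d_a = median_of_min_distances(query, slide_a)
d_b = median_of_min_distances(query, slide_b)

# The slide with the smaller distance is ranked first in retrieval.
best = "slide_a" if d_a < d_b else "slide_b"
```

Because the median ignores the single outlying query patch at `[10.0, 0.0]`, `slide_a` is still ranked closest. The trade-off is that every comparison touches all patch pairs, which is the cost a single-vector WSI representation is meant to avoid.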
Abstract
A crucial step to efficiently integrate Whole Slide Images (WSIs) in computational pathology is assigning a single high-quality feature vector, i.e., one embedding, to each WSI. With the existence of many pre-trained deep neural networks and the emergence of foundation models, extracting embeddings for sub-images (i.e., tiles or patches) is straightforward. However, given the high resolution and gigapixel nature of WSIs, inputting them into existing GPUs as a single image is not feasible. As a result, WSIs are usually split into many patches. Once each patch is fed to a pre-trained model, each WSI can be represented by a set of patches and, hence, a set of embeddings. In such a setup, WSI representation learning reduces to set representation learning, where each WSI is available as a set of patch embeddings. To obtain a single embedding from a set of patch embeddings for…
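The set-to-vector step described in the abstract can be illustrated with the simplest aggregators, mean and max pooling. This is a sketch under stated assumptions, not the paper's implementation: the function names and the toy 4-dimensional embeddings are hypothetical, and real patch embeddings would come from a pre-trained model.

```python
from typing import List, Sequence

Vector = List[float]


def mean_pool(embeddings: Sequence[Sequence[float]]) -> Vector:
    """Average each dimension over all patch embeddings of one WSI."""
    if not embeddings:
        raise ValueError("need at least one patch embedding")
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def max_pool(embeddings: Sequence[Sequence[float]]) -> Vector:
    """Take the per-dimension maximum over all patch embeddings."""
    if not embeddings:
        raise ValueError("need at least one patch embedding")
    return [max(column) for column in zip(*embeddings)]


# Three hypothetical 4-dimensional patch embeddings from one WSI.
patch_embeddings = [
    [0.0, 1.0, 2.0, 4.0],
    [2.0, 1.0, 0.0, 0.0],
    [4.0, 1.0, 1.0, 2.0],
]

wsi_mean = mean_pool(patch_embeddings)  # [2.0, 1.0, 1.0, 2.0]
wsi_max = max_pool(patch_embeddings)    # [4.0, 1.0, 2.0, 4.0]
```

Both functions are permutation invariant: reordering the patches leaves the WSI vector unchanged, which is the property any set aggregator needs. Learned aggregators such as Deep Sets keep this property while replacing the fixed pooling with trainable transformations before and after the pooling step.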
Taxonomy
Topics: AI in cancer detection
Methods: Max Pooling · Sparse Evolutionary Training · Deep Sets
