MLLM-HWSI: A Multimodal Large Language Model for Hierarchical Whole Slide Image Understanding
Basit Alawode, Arif Mahmood, Muaz Khalifa Al-Radi, Shahad Albastaki, Asim Khan, Muhammad Bilal, Moshira Ali Abdalla, Mohammed Bennamoun, Sajid Javed

TL;DR
MLLM-HWSI introduces a hierarchical multimodal large language model for whole slide image analysis, aligning visual features at multiple scales with pathology language to improve interpretability and performance across diagnostic tasks.
Contribution
It presents a novel hierarchical WSI-level MLLM that aligns multi-scale visual features with pathology language, enabling interpretable reasoning and state-of-the-art results.
Findings
Achieves new SOTA on 13 WSI benchmarks
Provides interpretable, evidence-grounded reasoning
Supports diverse pathology tasks like VQA and report generation
Abstract
Whole Slide Images (WSIs) exhibit hierarchical structure, where diagnostic information emerges from cellular morphology, regional tissue organization, and global context. Existing Computational Pathology (CPath) Multimodal Large Language Models (MLLMs) typically compress an entire WSI into a single embedding, which hinders fine-grained grounding and ignores how pathologists synthesize evidence across different scales. We introduce \textbf{MLLM-HWSI}, a Hierarchical WSI-level MLLM that aligns visual features with pathology language at four distinct scales, cell as word, patch as phrase, region as sentence, and WSI as paragraph to support interpretable evidence-grounded reasoning. MLLM-HWSI decomposes each WSI into multi-scale embeddings with scale-specific projectors and jointly enforces (i) a hierarchical contrastive objective and (ii) a cross-scale consistency loss, preserving semantic…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAI in cancer detection · Cell Image Analysis Techniques · Digital Imaging for Blood Diseases
