Lifelong Whole Slide Image Analysis: Online Vision-Language Adaptation and Past-to-Present Gradient Distillation

Doanh C. Bui; Hoai Luan Pham; Vu Trung Duong Le; Tuan Hai Vu; Van Duy Tran; Khang Nguyen; Yasuhiko Nakashima

arXiv:2505.01984 · cs.CV · October 30, 2025

TL;DR

This paper introduces ADaFGrad, a novel lifelong learning framework for whole slide image analysis that leverages vision-language models and gradient distillation to improve accuracy and knowledge retention in cancer diagnosis tasks.

Contribution

The study presents a new lifelong learning method for WSI analysis that combines vision-language interaction and gradient distillation, outperforming existing methods in accuracy and retention.

Findings

01. ADaFGrad outperforms state-of-the-art methods by up to +5.068% in class-incremental accuracy.

02. ADaFGrad improves accuracy over the baseline by up to +40.084%.

03. ADaFGrad exhibits minimal forgetting, retaining most knowledge from previous tasks.

Abstract

Whole Slide Images (WSIs) play a crucial role in accurate cancer diagnosis and prognosis, as they provide tissue details at the cellular level. However, the rapid growth of computational tasks involving WSIs poses significant challenges. Given that WSIs are gigapixels in size, they present difficulties in terms of storage, processing, and model training. Therefore, it is essential to develop lifelong learning approaches for WSI analysis. In scenarios where slides are distributed across multiple institutes, we aim to leverage them to develop a unified online model as a computational tool for cancer diagnosis in clinical and hospital settings. In this study, we introduce ADaFGrad, a method designed to enhance lifelong learning for whole-slide image (WSI) analysis. First, we leverage pathology vision-language foundation models to develop a framework that enables interaction between a…

Taxonomy

Topics: Multimodal Machine Learning Applications