Historical Report Guided Bi-modal Concurrent Learning for Pathology Report Generation
Ling Zhang, Boxiang Yun, Qingli Li, Yan Wang

TL;DR
This paper introduces BiGen, a framework for pathology report generation from whole slide images (WSIs). It combines knowledge retrieval, which supplies the semantic content that visual features lack, with bi-modal concurrent learning, which extracts key features from redundant WSI data, and it reports state-of-the-art results.
Contribution
The paper proposes a bi-modal concurrent learning framework with knowledge retrieval for pathology report generation from WSIs, addressing the lack of semantic content in visual features and the information redundancy inherent in WSIs.
Findings
Achieves 7.4% improvement in NLP metrics
Enhances Her-2 classification by 19.1%
Validates modules through ablation studies
Abstract
Automated pathology report generation from Whole Slide Images (WSIs) faces two key challenges: (1) lack of semantic content in visual features and (2) inherent information redundancy in WSIs. To address these issues, we propose a novel Historical Report Guided Bi-modal Concurrent Learning Framework for Pathology Report Generation (BiGen) emulating pathologists' diagnostic reasoning, consisting of: (1) a knowledge retrieval mechanism to provide rich semantic content, which retrieves WSI-relevant knowledge from a pre-built medical knowledge bank by matching high-attention patches, and (2) a bi-modal concurrent learning strategy, instantiated via a learnable visual token and a learnable textual token, to dynamically extract key visual features and retrieved knowledge, where weight-shared layers enable cross-modal alignment between visual features and knowledge features. Our…
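The two components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the cosine-similarity matching, the single-head attention pooling, the dimensions, and the random data are all assumptions made for illustration, and the knowledge bank is assumed to store paired visual keys and text-embedding values. The sketch only shows the general idea of retrieving bank entries with high-attention patches and then pooling each modality with its own learnable token through the same (weight-shared) projections.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def retrieve_knowledge(patches, attn, bank_keys, bank_values,
                       top_patches=4, top_k=2):
    """Hypothetical retrieval step: match the highest-attention patches
    against the bank keys by cosine similarity and return the matched values."""
    top_idx = np.argsort(attn)[::-1][:top_patches]    # highest-attention patches
    queries = l2_normalize(patches[top_idx])
    sims = queries @ l2_normalize(bank_keys).T        # (top_patches, bank_size)
    nearest = np.argsort(-sims, axis=1)[:, :top_k]    # top_k entries per patch
    entry_ids = np.unique(nearest.ravel())            # drop duplicate entries
    return bank_values[entry_ids], entry_ids

def shared_attention_pool(token, features, w_q, w_k, w_v):
    """Single-head attention: one learnable token queries a set of features."""
    q = token @ w_q                                   # (d,)
    k = features @ w_k                                # (n, d)
    v = features @ w_v                                # (n, d)
    weights = softmax(k @ q / np.sqrt(q.shape[-1]))   # (n,)
    return weights @ v                                # (d,)

# Toy data (assumed shapes): 50 patch features, 20 bank entries, dimension 16.
rng = np.random.default_rng(0)
d = 16
patches = rng.normal(size=(50, d))
attn = softmax(rng.normal(size=50))       # stand-in for patch attention scores
bank_keys = rng.normal(size=(20, d))      # assumed visual keys of the bank
bank_values = rng.normal(size=(20, d))    # assumed text embeddings of the bank

knowledge, entry_ids = retrieve_knowledge(patches, attn, bank_keys, bank_values)

# One set of projection weights is shared by both modalities.
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
visual_token = rng.normal(size=d)         # would be learned in a real model
textual_token = rng.normal(size=d)        # would be learned in a real model

visual_summary = shared_attention_pool(visual_token, patches, w_q, w_k, w_v)
knowledge_summary = shared_attention_pool(textual_token, knowledge, w_q, w_k, w_v)
```

Because both tokens pass through the same `w_q`, `w_k`, and `w_v`, the pooled visual and knowledge summaries land in a common space, which is one plausible reading of how weight sharing could support cross-modal alignment. In a trained model the tokens and projections would be optimized jointly with the report decoder rather than sampled at random.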
