Discovering Pathology Rationale and Token Allocation for Efficient Multimodal Pathology Reasoning
Zhe Xu, Cheng Jin, Yihui Wang, Ziyi Liu, Hao Chen

TL;DR
This paper introduces a bilateral reinforcement learning framework for multimodal pathology understanding, significantly improving reasoning accuracy and reducing computational costs in diagnostic tasks.
Contribution
It proposes a novel dual-branch approach that enhances reasoning and optimizes token allocation without explicit supervision, advancing multimodal pathology analysis.
Findings
Achieved +41.7% performance improvement
Reduced inference costs by 70.3%
Effective across multiple pathological tasks
Abstract
Multimodal pathological image understanding has garnered widespread interest due to its potential to improve diagnostic accuracy and enable personalized treatment through integrated visual and textual data. However, existing methods exhibit limited reasoning capabilities, which hamper their ability to handle complex diagnostic scenarios. Additionally, the enormous size of pathological images leads to severe computational burdens, further restricting their practical deployment. To address these limitations, we introduce a novel bilateral reinforcement learning framework comprising two synergistic branches. One reinforcement branch enhances the reasoning capability by enabling the model to learn task-specific decision processes, i.e., pathology rationales, directly from labels without explicit reasoning supervision, while the other branch dynamically allocates a tailored number of tokens…
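The abstract describes two reinforcement signals: one driven only by label correctness and one tied to how many tokens an image is given. The listing does not state how the paper defines or combines these rewards, so the following is only a minimal sketch under stated assumptions. The function names, the budget normalization, and the `cost_weight` parameter are all hypothetical and are not taken from the paper.

```python
def outcome_reward(predicted_label: str, true_label: str) -> float:
    """Label-only reward: 1.0 if the final answer matches the label, else 0.0.

    Assumption: no reasoning trace is scored, only the final answer.
    """
    same = predicted_label.strip().lower() == true_label.strip().lower()
    return 1.0 if same else 0.0


def efficiency_reward(tokens_used: int, max_tokens: int) -> float:
    """Fraction of a (hypothetical) visual-token budget left unused, in [0, 1]."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    used = min(max(tokens_used, 0), max_tokens)  # clamp into [0, max_tokens]
    return 1.0 - used / max_tokens


def combined_reward(
    predicted_label: str,
    true_label: str,
    tokens_used: int,
    max_tokens: int,
    cost_weight: float = 0.2,  # hypothetical trade-off weight
) -> float:
    """Reward token savings only when the answer is correct.

    Assumption: gating on correctness keeps the policy from learning to
    discard tokens at the expense of accuracy.
    """
    correct = outcome_reward(predicted_label, true_label)
    saving = efficiency_reward(tokens_used, max_tokens)
    return correct * (1.0 + cost_weight * saving)


if __name__ == "__main__":
    # A correct answer that uses a quarter of the budget.
    print(combined_reward("Tumor", "tumor", tokens_used=256, max_tokens=1024))
    # A wrong answer earns nothing, however few tokens it used.
    print(combined_reward("normal", "tumor", tokens_used=64, max_tokens=1024))
```

Gating the efficiency term on correctness is one possible design choice among several; an additive penalty on token count would be another, and the paper may use neither.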
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
Methods: Balanced Selection
