Analogical Reasoning as a Doctor: A Foundation Model for Gastrointestinal Endoscopy Diagnosis

Peixi Peng (1), Housheng Xie (1), Yanling Wei (2), Guangcong Ruan (2), Xiaoyang Zou (1), Qian Cao (3), Yongjian Nian (2), Guoyan Zheng (1) ((1) Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University; (2) Daping Hospital, Army Medical University; (3) Sir Run Run Shaw Hospital, Zhejiang University School of Medicine)

arXiv:2604.05649·cs.CV·April 8, 2026

TL;DR

RATNet is a novel foundation model for gastrointestinal endoscopy diagnosis that leverages analogical reasoning to improve generalization, adaptability, and robustness across diverse datasets and scenarios.

Contribution

The paper introduces RATNet, a foundation model that transfers knowledge from heterogeneous annotations using a cyclic pre-training strategy, enhancing endoscopic diagnosis performance.

Findings

01. RATNet outperforms existing models such as GastroNet and GastroVision.

02. It demonstrates strong zero-shot transfer and few-shot learning capabilities.

03. RATNet maintains robustness across long-tailed disease distributions.

Abstract

Gastrointestinal diseases impose a growing global health burden, and endoscopy is a primary tool for early diagnosis. However, routine endoscopic image interpretation still suffers from missed lesions and limited efficiency. Although AI-assisted diagnosis has shown promise, existing models often lack generalizability, adaptability, robustness, and scalability because of limited medical data, domain shift, and heterogeneous annotations. To address these challenges, we develop RATNet, a foundation model for gastrointestinal endoscopy imaging based on analogical reasoning. RATNet acquires and transfers knowledge from heterogeneous expert annotations across five gastrointestinal endoscopy datasets through a cyclic pre-training strategy. Its architecture consists of an encoder, a relevance-knowledge acquisition and transfer (RAT) module, a projector, and a multi-task head, and supports…
