PRIOR: Prototype Representation Joint Learning from Medical Images and Reports

Pujin Cheng, Li Lin, Junyan Lyu, Yijin Huang, Wenhan Luo, Xiaoying Tang

arXiv:2307.12577 · cs.CV · March 12, 2024 · 6 cites

TL;DR

PRIOR introduces a prototype-based joint learning framework for medical images and reports, leveraging local and global alignment, cross-modality reconstruction, and a non-auto-regressive generation paradigm to improve diverse medical vision-language tasks.

Contribution

It proposes a novel prototype representation learning framework with local alignment and cross-modality reconstruction for medical vision-language understanding.

Findings

1. Outperforms state-of-the-art methods on multiple medical vision-language tasks.

2. Effective in both supervised and zero-shot classification scenarios.

3. Enhances image-to-text retrieval, segmentation, and detection performance.

Abstract

Contrastive learning based vision-language joint pre-training has emerged as a successful representation learning strategy. In this paper, we present a prototype representation learning framework incorporating both global and local alignment between medical images and reports. In contrast to standard global multi-modality alignment methods, we employ a local alignment module for fine-grained representation. Furthermore, a cross-modality conditional reconstruction module is designed to interchange information across modalities in the training phase by reconstructing masked images and reports. For reconstructing long reports, a sentence-wise prototype memory bank is constructed, enabling the network to focus on low-level localized visual and high-level clinical linguistic features. Additionally, a non-auto-regressive generation paradigm is proposed for reconstructing non-sequential…
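To make the global alignment and the sentence-wise prototype idea from the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not state the exact loss, so the sketch assumes a symmetric InfoNCE objective of the kind commonly used in contrastive vision-language pre-training, and it assumes that a sentence embedding is matched to the memory bank by cosine similarity. The function names, the `temperature` value of 0.07, and the toy embeddings are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def global_alignment_loss(image_embs, report_embs, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired image/report embeddings.

    Assumption: pair i in image_embs belongs to pair i in report_embs.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    reps = [l2_normalize(v) for v in report_embs]
    # Image-to-report similarity matrix, scaled by the temperature.
    logits = [[dot(i, r) / temperature for r in reps] for i in imgs]
    # Transpose to get the report-to-image direction.
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


def nearest_prototype(sentence_emb, prototypes):
    """Index of the prototype with the highest cosine similarity (assumed lookup rule)."""
    s = l2_normalize(sentence_emb)
    sims = [dot(s, l2_normalize(p)) for p in prototypes]
    return max(range(len(sims)), key=sims.__getitem__)


if __name__ == "__main__":
    # Toy, hand-made embeddings; real ones would come from image and text encoders.
    images = [[1.0, 0.0], [0.0, 1.0]]
    reports = [[1.0, 0.0], [0.0, 1.0]]
    print("aligned loss:", global_alignment_loss(images, reports))
    print("shuffled loss:", global_alignment_loss(images, reports[::-1]))
    print("prototype index:", nearest_prototype([0.9, 0.1], [[1.0, 0.0], [0.0, 1.0]]))
```

With correctly paired embeddings the loss is close to zero, and it grows when the reports are shuffled, which is the behaviour a global alignment objective is meant to reward. The local alignment and cross-modality reconstruction modules are not covered by this sketch.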

Code & Models

Repositories

qtacierp/prior · PyTorch · Official

Videos

PRIOR: Prototype Representation Joint Learning from Medical Images and Reports · YouTube

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI

Methods: Focus