Improving Joint Learning of Chest X-Ray and Radiology Report by Word Region Alignment
Zhanghexuan Ji, Mohammad Abuzar Shaikh, Dana Moukheiber, Sargur Srihari, Yifan Peng, Mingchen Gao

TL;DR
This paper introduces JoImTeRNet, a self-supervised joint learning framework for chest X-ray images and radiology reports, leveraging global and local visual-textual alignment to improve downstream medical imaging tasks.
Contribution
The paper presents a novel multi-modal pre-training method that aligns image regions with report words using attention, enhancing representation learning without manual supervision.
Findings
Improved cross-modality retrieval performance
Enhanced multi-label classification accuracy
Effective local and global alignment of image and text features
Abstract
Self-supervised learning provides an opportunity to explore unlabeled chest X-rays and their associated free-text reports, accumulated in clinical routine, without manual supervision. This paper proposes a Joint Image Text Representation Learning Network (JoImTeRNet) for pre-training on chest X-ray images and their radiology reports. The model is pre-trained for visual-textual matching at both the global image-sentence level and the local image region-word level. Both levels are constrained bidirectionally by cross-entropy-based and ranking-based triplet matching losses. The region-word matching is calculated using the attention mechanism, without direct supervision of the mapping between regions and words. The pre-trained multi-modal representation learning paves the way for downstream tasks concerning image and/or text encoding. We demonstrate the representation learning quality by cross-modality retrievals and…
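The local alignment and ranking loss described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the feature shapes, the cosine similarity, the `temperature` and `margin` values, and all function names below are assumptions chosen for illustration. Each word attends over the image regions, the attended region vector is compared with the word, and the resulting image-report scores feed a bidirectional triplet ranking loss in which matched pairs sit on the diagonal of the score matrix.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Scale vectors to unit length; eps avoids division by zero.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def region_word_score(regions, words, temperature=4.0):
    """Attention-based matching score between one image and one report.

    regions: (R, D) region features; words: (T, D) word features.
    The temperature is a hypothetical sharpening factor, not a value
    taken from the paper.
    """
    r = l2_normalize(regions)
    w = l2_normalize(words)
    sim = w @ r.T                              # (T, R) cosine similarities
    attn = softmax(temperature * sim, axis=1)  # each word attends over regions
    context = attn @ regions                   # (T, D) attended region vector per word
    word_scores = (w * l2_normalize(context)).sum(axis=1)
    return float(word_scores.mean())           # average over words

def score_matrix(image_batch, report_batch):
    # scores[i, j] = matching score of image i with report j.
    scores = np.zeros((len(image_batch), len(report_batch)))
    for i, regions in enumerate(image_batch):
        for j, words in enumerate(report_batch):
            scores[i, j] = region_word_score(regions, words)
    return scores

def bidirectional_triplet_loss(scores, margin=0.2):
    """Hinge-based ranking loss in both retrieval directions.

    Assumes a square score matrix with matched pairs on the diagonal.
    The margin is an illustrative value.
    """
    pos = np.diag(scores)
    # Image -> report: every other report in the row is a negative.
    cost_i2t = np.maximum(0.0, margin + scores - pos[:, None])
    # Report -> image: every other image in the column is a negative.
    cost_t2i = np.maximum(0.0, margin + scores - pos[None, :])
    off_diag = ~np.eye(scores.shape[0], dtype=bool)
    total = cost_i2t[off_diag].sum() + cost_t2i[off_diag].sum()
    return float(total / scores.shape[0])
```

No region-word labels appear anywhere in the sketch: the attention weights are derived from feature similarity alone, which mirrors the abstract's statement that the mapping is learned without direct supervision. The cross-entropy matching loss and the global image-sentence branch are omitted here for brevity.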
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · COVID-19 diagnosis using AI
