IMITATE: Clinical Prior Guided Hierarchical Vision-Language Pre-training
Che Liu, Sibo Cheng, Miaojing Shi, Anand Shah, Wenjia Bai, Rossella Arcucci

TL;DR
IMITATE leverages the hierarchical structure of clinical reports to improve vision-language pre-training in medical imaging, leading to better alignment and performance across multiple datasets and tasks.
Contribution
The paper introduces a novel hierarchical vision-language pre-training framework, IMITATE, that incorporates clinical report structure and a clinical-informed contrastive loss for enhanced medical image understanding.
Findings
Outperforms baseline methods on six datasets.
Improves performance across five medical imaging tasks.
Effectively leverages report hierarchy for better alignment.
Abstract
In the field of medical Vision-Language Pre-training (VLP), significant efforts have been devoted to deriving text and image features from both clinical reports and associated medical images. However, most existing methods overlook the opportunity to leverage the inherent hierarchical structure of clinical reports, which are generally split into 'findings' for descriptive content and 'impressions' for conclusive observations. Instead of utilizing this rich, structured format, current medical VLP approaches often simplify the report into either a unified entity or fragmented tokens. In this work, we propose a novel clinical prior guided VLP framework named IMITATE to learn the structure information from medical reports with hierarchical vision-language alignment. The framework derives multi-level visual features from the chest X-ray (CXR) images and separately aligns these…
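To make the idea of hierarchical alignment concrete, here is a minimal NumPy sketch of aligning two groups of image features with the two report sections using a standard symmetric InfoNCE loss. This is an illustration under stated assumptions, not the authors' implementation: the pairing of visual levels with report sections, the function names (`info_nce`, `hierarchical_alignment_loss`), the temperature value, and the equal weighting of the two terms are all hypothetical. It also uses plain InfoNCE rather than the clinical-informed contrastive loss the paper proposes, whose details are not given in this summary.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def info_nce(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch; row i of each input is a matched pair.

    The temperature of 0.07 is an assumed default, not a value from the paper.
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature  # shape: (batch, batch)
    idx = np.arange(logits.shape[0])
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()  # image -> text
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()  # text -> image
    return 0.5 * (loss_i2t + loss_t2i)


def hierarchical_alignment_loss(level_a_img, level_b_img,
                                findings_txt, impressions_txt):
    """Sum of two alignment terms, one per report section.

    Assumption: one group of visual features is paired with 'findings'
    and another with 'impressions'; the terms are weighted equally.
    """
    return (info_nce(level_a_img, findings_txt)
            + info_nce(level_b_img, impressions_txt))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch, dim = 8, 16
    # Random stand-ins for encoder outputs (no real images or reports).
    level_a = rng.normal(size=(batch, dim))
    level_b = rng.normal(size=(batch, dim))
    findings = rng.normal(size=(batch, dim))
    impressions = rng.normal(size=(batch, dim))
    print(hierarchical_alignment_loss(level_a, level_b, findings, impressions))
```

Keeping the two terms separate means each report section supervises its own set of visual features, in contrast to encoding the whole report as a single embedding.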
Taxonomy
Topics: Multimodal Machine Learning Applications · Radiomics and Machine Learning in Medical Imaging · COVID-19 diagnosis using AI
