Structure Observation Driven Image-Text Contrastive Learning for Computed Tomography Report Generation
Hong Liu, Dong Wei, Qiong Peng, Yawen Huang, Xian Wu, Yefeng Zheng, Liansheng Wang

TL;DR
This paper presents a novel two-stage structure observation driven contrastive learning framework for CT report generation, improving semantic alignment between images and reports and achieving state-of-the-art results.
Contribution
Introduces a structure-wise contrastive learning approach with structure-specific queries and a negative queue, enhancing CT report generation accuracy.
Findings
Achieves new state-of-the-art performance on public datasets.
Effective structure-level semantic correspondence learning.
Improved clinical efficiency in report generation.
Abstract
Computed Tomography Report Generation (CTRG) aims to automate the clinical radiology reporting process, thereby reducing the workload of report writing and facilitating patient care. While deep learning approaches have achieved remarkable advances in X-ray report generation, their effectiveness may be limited in CTRG due to larger data volumes of CT images and more intricate details required to describe them. This work introduces a novel two-stage (structure- and report-learning) framework tailored for CTRG featuring effective structure-wise image-text contrasting. In the first stage, a set of learnable structure-specific visual queries observe corresponding structures in a CT image. The resulting observation tokens are contrasted with structure-specific textual features extracted from the accompanying radiology report with a structure-wise image-text contrastive loss. In addition,…
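To make the first-stage idea concrete, here is a minimal sketch of what a structure-wise image-text contrastive loss could look like. It assumes an InfoNCE-style objective with a per-structure queue of negative text features; the excerpt above does not give the exact formulation, so this is not the authors' implementation. The function name `structure_contrastive_loss`, the dictionary layout, and the temperature value are all hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v] if norm > 0 else list(v)


def structure_contrastive_loss(obs_tokens, text_feats, negative_queue,
                               temperature=0.07):
    """Average InfoNCE-style loss over anatomical structures.

    obs_tokens:     {structure: visual observation token}      (assumed layout)
    text_feats:     {structure: paired textual feature}        (assumed layout)
    negative_queue: {structure: [negative textual features]}   (assumed layout)
    """
    losses = []
    for name, obs in obs_tokens.items():
        obs = normalize(obs)
        pos = normalize(text_feats[name])
        negs = [normalize(n) for n in negative_queue.get(name, [])]
        # Cosine similarities scaled by temperature; the positive comes first.
        logits = [dot(obs, pos) / temperature]
        logits += [dot(obs, n) / temperature for n in negs]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_denom - logits[0])
    return sum(losses) / len(losses)


# Toy example with two structures and 2-D features.
obs = {"lung": [1.0, 0.0], "heart": [0.0, 1.0]}
matched = {"lung": [1.0, 0.0], "heart": [0.0, 1.0]}
mismatched = {"lung": [0.0, 1.0], "heart": [1.0, 0.0]}
queue = {"lung": [[0.0, 1.0]], "heart": [[1.0, 0.0]]}

loss_matched = structure_contrastive_loss(obs, matched, queue)
loss_mismatched = structure_contrastive_loss(obs, mismatched, queue)
```

In this toy setup the loss is near zero when each observation token matches its own structure's text feature, and it rises when the pairing is swapped, which is the behavior a structure-level alignment objective is meant to reward.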
Taxonomy
Topics: Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques · Topic Modeling
