Dynamic Traceback Learning for Medical Report Generation
Shuchang Ye, Mingyuan Meng, Mingjian Li, Dagan Feng, Usman Naseem, Jinman Kim

TL;DR
This paper introduces DTrace, a novel multimodal dynamic traceback learning framework that improves medical report generation by better capturing pathological details and enabling effective zero-shot inference using only images.
Contribution
The study proposes a new traceback mechanism and dynamic learning strategy that enhance semantic supervision and modality flexibility in medical report generation models.
Findings
DTrace outperforms state-of-the-art methods on the IU-Xray and MIMIC-CXR datasets.
The framework captures pathological details more accurately.
It generates reports effectively from images alone at inference time.
Abstract
Automated medical report generation has demonstrated the potential to significantly reduce the workload associated with time-consuming medical reporting. Recent generative representation learning methods have shown promise in integrating vision and language modalities for medical report generation. However, when trained end-to-end and applied directly to medical image-to-text generation, they face two significant challenges: i) difficulty in accurately capturing subtle yet crucial pathological details, and ii) reliance on both visual and textual inputs during inference, leading to performance degradation in zero-shot inference when only images are available. To address these challenges, this study proposes a novel multimodal dynamic traceback learning framework (DTrace). Specifically, we introduce a traceback mechanism to supervise the semantic validity of generated content and a…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Image Retrieval and Classification Techniques
