Medical Report Generation Is A Multi-label Classification Problem
Yijian Fan, Zhenbang Yang, Rui Liu, Mingjie Li, Xiaojun Chang

TL;DR
This paper redefines medical report generation as a multi-label classification task, utilizing knowledge graph nodes and a BLIP-based framework to improve accuracy and achieve state-of-the-art results.
Contribution
It introduces a novel classification-based framework for medical report generation, leveraging knowledge graph nodes and a BLIP model to enhance performance.
Findings
Achieved state-of-the-art performance on benchmark datasets.
Outperformed traditional sequence generation methods.
Validated the effectiveness of the classification approach.
Abstract
Medical report generation is a critical task in healthcare that involves the automatic creation of detailed and accurate descriptions from medical images. Traditionally, this task has been approached as a sequence generation problem, relying on vision-and-language techniques to generate coherent and contextually relevant reports. However, in this paper, we propose a novel perspective: rethinking medical report generation as a multi-label classification problem. By framing the task this way, we leverage the radiology nodes from the commonly used knowledge graph, which can be better captured through classification techniques. To verify our argument, we introduce a novel report generation framework based on BLIP integrated with classified key nodes, which allows for effective report generation with accurate classification of multiple key aspects within the medical images. This approach not…
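To make the multi-label framing concrete, here is a minimal sketch of how a set of knowledge graph nodes could be predicted independently for one image. It is not the authors' implementation: the node names, the logits, and the 0.5 threshold are hypothetical values chosen for illustration, and the image encoder that would produce the logits (BLIP in the paper) is left out.

```python
import math

# Hypothetical node labels for illustration only; not the paper's actual node set.
NODES = ["cardiomegaly", "pleural effusion", "pneumothorax", "edema"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_nodes(logits, threshold=0.5):
    """Return every node whose independent sigmoid score reaches the threshold.

    Unlike softmax classification, the scores do not compete with each other,
    so any subset of nodes (including none) can be active for one image.
    """
    return [node for node, z in zip(NODES, logits) if sigmoid(z) >= threshold]


def bce_loss(logits, targets):
    """Mean binary cross-entropy over nodes, the usual multi-label training loss."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(logits)


# Assumed classifier outputs for a single image (one logit per node).
logits = [2.0, -1.5, -3.0, 0.3]
# Assumed ground-truth node labels extracted from the reference report.
targets = [1, 0, 0, 1]

predicted = predict_nodes(logits)
loss = bce_loss(logits, targets)
print(predicted)
print(round(loss, 4))
```

In a pipeline of the kind the abstract describes, the predicted nodes would then be passed to the report generator as additional input. How that conditioning is done is not specified in the text above.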
Taxonomy
Topics: Text and Document Classification Technologies
Methods: BLIP (Bootstrapping Language-Image Pre-training)
