Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation
Mingjie Li, Fuyu Wang, Xiaojun Chang, Xiaodan Liang

TL;DR
This paper introduces ASGK, a novel model that mimics radiologists' focus and integrates medical knowledge for improved automatic medical report generation from images.
Contribution
The proposed ASGK model uniquely combines internal visual features and external medical language information to enhance report accuracy and relevance.
Findings
Outperforms state-of-the-art methods on medical terminology classification.
Generates more accurate and coherent medical reports.
Effective on multiple datasets including COVID-19 CT reports.
Abstract
Beyond the common difficulties faced in natural image captioning, medical report generation specifically requires the model to describe a medical image with a fine-grained, semantically coherent paragraph that satisfies both medical commonsense and logic. Previous works generally extract global image features and attempt to generate a paragraph similar to the reference reports; however, this approach has two limitations. First, the regions of primary interest to radiologists usually occupy a small area of the image, so the remaining parts of the image can be considered irrelevant noise during training. Second, each medical report contains many similar sentences describing the normal regions of the image, which causes serious data bias. This bias is likely to teach models to generate these inessential sentences…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques
