Addressing Data Bias Problems for Chest X-ray Image Report Generation
Philipp Harzig, Yan-Ying Chen, Francine Chen, Rainer Lienhart

TL;DR
This paper proposes a hierarchical LSTM model that separately generates normal and abnormal sentences in chest X-ray reports to improve variability and address data bias issues in automatic medical report generation.
Contribution
It introduces a novel approach of separating abnormal and normal sentence generation using two different word LSTMs within a hierarchical model.
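The routing idea behind this contribution can be sketched as plain control flow. This is a minimal illustration, not the authors' implementation. It assumes some upstream component (in the paper, the sentence-level LSTM of the hierarchical model) yields one topic vector per sentence plus a per-sentence abnormality probability. The name generate_report, the 0.5 threshold, and the toy decoders are hypothetical stand-ins for two trained word LSTMs.

```python
from typing import Callable, List, Sequence

# Hypothetical type: a word decoder maps a topic vector to a list of tokens.
WordDecoder = Callable[[Sequence[float]], List[str]]


def generate_report(
    topics: Sequence[Sequence[float]],
    abnormal_probs: Sequence[float],
    normal_decoder: WordDecoder,
    abnormal_decoder: WordDecoder,
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[str]:
    """Decode one sentence per topic vector, routing each to one of two decoders."""
    if len(topics) != len(abnormal_probs):
        raise ValueError("need one abnormality probability per topic vector")
    sentences = []
    for topic, p_abnormal in zip(topics, abnormal_probs):
        # Each sentence is produced by exactly one of the two word decoders.
        decoder = abnormal_decoder if p_abnormal >= threshold else normal_decoder
        sentences.append(" ".join(decoder(topic)))
    return sentences


# Placeholder decoders so the sketch runs without a trained model.
def toy_normal(topic: Sequence[float]) -> List[str]:
    return ["no", "acute", "findings"]


def toy_abnormal(topic: Sequence[float]) -> List[str]:
    return ["possible", "abnormality", "noted"]


report = generate_report([[0.1], [0.9]], [0.2, 0.8], toy_normal, toy_abnormal)
print(report)
```

Keeping the two decoders separate means the frequent normal sentences and the rarer abnormal ones are never generated by the same set of word-level weights, which is the separation the contribution describes.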
Findings
BLEU score increases when less distinct reports are generated
Separate sentence generation improves report variability
Analysis highlights need for better evaluation metrics
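The tension between distinctiveness and an n-gram overlap score can be illustrated with a toy example. This is a rough sketch under stated assumptions: distinct_ratio is one simple way to measure distinctiveness (the share of unique generated texts), and unigram_precision is only the clipped 1-gram ingredient of BLEU, not the full metric. Both function names and the example sentences are made up for illustration and are not taken from the paper.

```python
from collections import Counter
from typing import List


def distinct_ratio(texts: List[str]) -> float:
    """Fraction of unique strings among the generated texts (1.0 = all different)."""
    if not texts:
        return 0.0
    # Normalize case and whitespace so trivial variations do not count as distinct.
    normalized = [" ".join(t.lower().split()) for t in texts]
    return len(set(normalized)) / len(normalized)


def unigram_precision(candidate: str, reference: str) -> float:
    """Clipped unigram precision: the 1-gram part of BLEU, without brevity penalty."""
    cand = candidate.lower().split()
    if not cand:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(cand)
    # Each candidate word is credited at most as often as it occurs in the reference.
    overlap = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    return overlap / len(cand)


# Made-up references and a generator that always emits the same sentence.
references = ["the heart is normal", "the lungs are clear", "the heart is enlarged"]
templated = ["the heart is normal"] * 3

diversity = distinct_ratio(templated)
precisions = [unigram_precision(c, r) for c, r in zip(templated, references)]
print(diversity, precisions)
```

In this toy setup the repeated sentence still shares many words with references it does not actually match in meaning, so an overlap score alone does not reveal that every generated text is identical. Reporting a distinctiveness figure next to the overlap score makes that visible.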
Abstract
Automatic medical report generation from chest X-ray images is one possibility for assisting doctors to reduce their workload. However, the different patterns and data distribution of normal and abnormal cases can bias machine learning models. Previous attempts did not focus on isolating the generation of the abnormal and normal sentences in order to increase the variability of generated paragraphs. To address this, we propose to separate abnormal and normal sentence generation by using two different word LSTMs in a hierarchical LSTM model. We conduct an analysis on the distinctiveness of generated sentences compared to the BLEU score, which increases when less distinct reports are generated. We hope our findings will help to encourage the development of new metrics to better verify methods of automatic medical report generation.
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare · Natural Language Processing Techniques
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
