Addressing Data Bias Problems for Chest X-ray Image Report Generation

Philipp Harzig; Yan-Ying Chen; Francine Chen; Rainer Lienhart

arXiv:1908.02123·cs.CV·August 7, 2019·28 cites

Open Access

TL;DR

This paper proposes a hierarchical LSTM model that separately generates normal and abnormal sentences in chest X-ray reports to improve variability and address data bias issues in automatic medical report generation.

Contribution

The paper separates the generation of abnormal and normal sentences by using two different word LSTMs within a hierarchical LSTM model.

Findings

01 BLEU score increases when the generated reports are less distinct

02 Generating abnormal and normal sentences separately improves report variability

03 The analysis highlights the need for better evaluation metrics

Abstract

Automatic medical report generation from chest X-ray images is one possibility for assisting doctors to reduce their workload. However, the different patterns and data distribution of normal and abnormal cases can bias machine learning models. Previous attempts did not focus on isolating the generation of the abnormal and normal sentences in order to increase the variability of generated paragraphs. To address this, we propose to separate abnormal and normal sentence generation by using two different word LSTMs in a hierarchical LSTM model. We conduct an analysis on the distinctiveness of generated sentences compared to the BLEU score, which increases when less distinct reports are generated. We hope our findings will help to encourage the development of new metrics to better verify methods of automatic medical report generation.
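The trade-off between distinctiveness and BLEU mentioned in the abstract can be made concrete with a small sketch. The abstract does not define how distinctiveness is measured, so the code below assumes one simple proxy: the fraction of unique sentences among all generated sentences. The names `split_sentences` and `distinct_sentence_ratio` and the sample reports are hypothetical and are not taken from the paper.

```python
from typing import Iterable, List


def split_sentences(report: str) -> List[str]:
    """Split a report into lowercase sentences on periods.

    This is a crude, hypothetical splitter used only for illustration.
    """
    return [s.strip().lower() for s in report.split(".") if s.strip()]


def distinct_sentence_ratio(reports: Iterable[str]) -> float:
    """Return the fraction of unique sentences among all generated sentences.

    Assumed proxy for distinctiveness; the paper's own definition may differ.
    """
    sentences = [s for report in reports for s in split_sentences(report)]
    if not sentences:
        return 0.0
    return len(set(sentences)) / len(sentences)


# Made-up generated reports: two are identical "normal" templates.
generated = [
    "The heart is normal in size. The lungs are clear.",
    "The heart is normal in size. The lungs are clear.",
    "The heart is normal in size. There is a small left pleural effusion.",
]
ratio = distinct_sentence_ratio(generated)
```

A model that repeats the same frequent normal sentences for every image would score low on this ratio, while it may still overlap heavily with reference reports that are themselves dominated by normal findings, which is one plausible reading of why BLEU can rise as distinctiveness falls.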

Taxonomy

Topics: Topic Modeling · Machine Learning in Healthcare · Natural Language Processing Techniques

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory