Progressive Transformer-Based Generation of Radiology Reports

Farhad Nooralahzadeh; Nicolas Perez Gonzalez; Thomas Frauenfelder; Koji Fujimoto; Michael Krauthammer

arXiv:2102.09777 · cs.CL · September 1, 2021

TL;DR

This paper introduces a progressive, two-step transformer-based framework for radiology report generation, improving accuracy by first extracting global concepts from images and then refining them into detailed reports, inspired by Curriculum Learning.

Contribution

It presents a novel image-to-text-to-text generation approach that divides report creation into global concept extraction and detailed report refinement, outperforming previous methods.

Findings

01. Achieved state-of-the-art results on two benchmark datasets.

02. Demonstrated the effectiveness of a two-step generation process.

03. Improved coherence and accuracy in radiology report generation.

Abstract

Inspired by Curriculum Learning, we propose a consecutive (i.e., image-to-text-to-text) generation framework where we divide the problem of radiology report generation into two steps. Contrary to generating the full radiology report from the image at once, the model generates global concepts from the image in the first step and then reforms them into finer and coherent texts using a transformer architecture. We follow the transformer-based sequence-to-sequence paradigm at each step. We improve upon the state-of-the-art on two benchmark datasets.
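The two-step flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `ProgressiveReportGenerator`, the type aliases, and both toy stage functions are hypothetical placeholders, and in practice each stage would be a trained transformer sequence-to-sequence model rather than a hand-written rule. The sketch only shows how the intermediate text (global concepts) is passed from the first step to the second.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical type aliases; neither name comes from the paper.
ImageFeatures = Sequence[float]
Tokens = List[str]


@dataclass
class ProgressiveReportGenerator:
    """Chains two sequence-to-sequence stages: image -> concepts -> report."""

    # Stage 1 (assumed interface): maps image features to global concept tokens.
    image_to_concepts: Callable[[ImageFeatures], Tokens]
    # Stage 2 (assumed interface): rewrites concept tokens into report text.
    concepts_to_report: Callable[[Tokens], str]

    def generate(self, image: ImageFeatures) -> str:
        concepts = self.image_to_concepts(image)  # intermediate text
        return self.concepts_to_report(concepts)  # final, finer-grained report


# Toy stand-ins so the sketch runs; real stages would be trained transformers.
def toy_stage1(image: ImageFeatures) -> Tokens:
    return ["cardiomegaly"] if sum(image) > 1.0 else ["normal"]


def toy_stage2(concepts: Tokens) -> str:
    return "Findings: " + ", ".join(concepts) + "."


pipeline = ProgressiveReportGenerator(toy_stage1, toy_stage2)
report = pipeline.generate([0.9, 0.4])
print(report)
```

Keeping the two stages behind separate callables mirrors the image-to-text-to-text split: either stage can be swapped or retrained without changing the other, as long as the intermediate concept tokens keep the same format.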

Code & Models

Repositories

uzh-dqbm-cmi/argon (PyTorch, official)

Models

🤗 dheeena/MRI (model)

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques