Diagnostic Captioning: A Survey

John Pavlopoulos; Vasiliki Kougia; Ion Androutsopoulos; Dimitris Papamichail

arXiv:2101.07299·cs.CV·January 20, 2021

TL;DR

Diagnostic Captioning automates the generation of diagnostic texts from medical images, aiding physicians and reducing errors, with recent advances driven by deep learning and image captioning techniques.

Contribution

This survey provides a comprehensive overview of Diagnostic Captioning, including datasets, evaluation metrics, recent systems, shortcomings, and future research directions.

Findings

01 Several datasets available for DC evaluation

02 Deep learning has significantly advanced DC systems

03 Identified key challenges and future research directions

Abstract

Diagnostic Captioning (DC) concerns the automatic generation of a diagnostic text from a set of medical images of a patient collected during an examination. DC can assist inexperienced physicians, reducing clinical errors. It can also help experienced physicians produce diagnostic reports faster. Following the advances of deep learning, especially in generic image captioning, DC has recently attracted more attention, leading to several systems and datasets. This article is an extensive overview of DC. It presents relevant datasets, evaluation measures, and up-to-date systems. It also highlights shortcomings that hinder DC's progress and proposes future directions.

Taxonomy

Topics: Multimodal Machine Learning Applications