Consistency-preserving Visual Question Answering in Medical Imaging

Sergio Tascon-Morales; Pablo Márquez-Neila; Raphael Sznitman

arXiv:2206.13296·cs.CV·June 28, 2022

TL;DR

This paper introduces a novel training method for medical VQA models that enhances answer consistency and accuracy by incorporating known relations between questions, demonstrated on diabetic macular edema staging.

Contribution

It proposes a new loss function and training procedure that integrate question relations to improve consistency and accuracy in medical VQA systems.

Findings

01. Outperforms state-of-the-art baselines in consistency and accuracy
02. Improves trustworthiness of medical VQA models
03. Validated on diabetic macular edema staging

Abstract

Visual Question Answering (VQA) models take an image and a natural-language question as input and infer the answer to the question. Recently, VQA systems in medical imaging have gained popularity thanks to potential advantages such as patient engagement and second opinions for clinicians. While most research efforts have focused on improving architectures and overcoming data-related limitations, answer consistency has been overlooked even though it plays a critical role in establishing trustworthy models. In this work, we propose a novel loss function and corresponding training procedure that allow the inclusion of relations between questions in the training process. Specifically, we consider the case where implications between perception and reasoning questions are known a priori. To show the benefits of our approach, we evaluate it on the clinically relevant task of Diabetic…
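To make the idea of a relation-aware loss concrete, here is a minimal sketch in plain Python. It is not the loss defined in the paper. It assumes one particular implication (answering a reasoning question correctly implies that its associated perception question is also answered correctly) and one hypothetical penalty form: the perception cross-entropy weighted by exp(-gamma * reasoning cross-entropy). The function names and the parameters `lam` and `gamma` are illustrative assumptions.

```python
import math


def cross_entropy(probs, target):
    """Cross-entropy of a single prediction: -log p(target)."""
    return -math.log(max(probs[target], 1e-12))


def consistency_penalty(main_ce, sub_ce, gamma=1.0):
    """Hypothetical penalty for a (reasoning, perception) question pair.

    It is largest when the reasoning (main) question is answered well
    (low main_ce) while the perception (sub) question is answered badly
    (high sub_ce), which is the combination the assumed implication forbids.
    """
    return sub_ce * math.exp(-gamma * main_ce)


def total_loss(pairs, lam=0.5, gamma=1.0):
    """Mean VQA cross-entropy plus a weighted consistency term.

    pairs: list of ((main_probs, main_target), (sub_probs, sub_target)),
    one entry per related question pair on the same image.
    """
    vqa, cons = 0.0, 0.0
    for (main_probs, main_target), (sub_probs, sub_target) in pairs:
        main_ce = cross_entropy(main_probs, main_target)
        sub_ce = cross_entropy(sub_probs, sub_target)
        vqa += main_ce + sub_ce
        cons += consistency_penalty(main_ce, sub_ce, gamma)
    return (vqa + lam * cons) / len(pairs)


# One inconsistent pair: reasoning answered confidently right,
# perception answered confidently wrong.
inconsistent = [(([0.1, 0.9], 1), ([0.9, 0.1], 1))]
# One consistent pair: both answered confidently right.
consistent = [(([0.1, 0.9], 1), ([0.1, 0.9], 1))]

print(total_loss(inconsistent), total_loss(consistent))
```

With `gamma=1.0` the weight exp(-gamma * main_ce) equals the probability the model assigns to the correct reasoning answer, so under this sketch the penalty grows as the model becomes more confident in a reasoning answer that its perception answer does not support.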

Code & Models

Repositories

sergiotasconmorales/consistency_vqa (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning