LaPA: Latent Prompt Assist Model For Medical Visual Question Answering

Tiancheng Gu; Kaicheng Yang; Dongnan Liu; Weidong Cai

arXiv:2404.13039 · cs.CV · April 22, 2024 · 1 citation

TL;DR

This paper introduces LaPA, a novel model for medical visual question answering that leverages latent prompts and clinical knowledge fusion to improve answer accuracy on medical image datasets.

Contribution

The paper proposes a latent prompt generation and multi-modal fusion approach that enhances clinical information extraction in Med-VQA tasks, outperforming existing models.

Findings

01. LaPA achieves higher accuracy than state-of-the-art models on three Med-VQA datasets.

02. The latent prompt module effectively captures target answers and clinical relevance.

03. Incorporating prior knowledge improves the model's interpretability and performance.

Abstract

Medical visual question answering (Med-VQA) aims to automate the prediction of correct answers for medical images and questions, thereby assisting physicians in reducing repetitive tasks and alleviating their workload. Existing approaches primarily focus on pre-training models using additional and comprehensive datasets, followed by fine-tuning to enhance performance in downstream tasks. However, there is also significant value in exploring existing models to extract clinically relevant information. In this paper, we propose the Latent Prompt Assist model (LaPA) for medical visual question answering. Firstly, we design a latent prompt generation module to generate the latent prompt with the constraint of the target answer. Subsequently, we propose a multi-modal fusion block with latent prompt fusion module that utilizes the latent prompt to extract clinical-relevant information from…
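As a rough illustration of the fusion idea the abstract describes, the sketch below lets a small set of latent prompt vectors attend over image and question tokens with scaled dot-product attention. Everything here is an assumption made for illustration: the abstract does not state that the fusion uses cross-attention, and the names `cross_attend` and `fuse_with_latent_prompt`, the residual update, the token counts, and the random features are all hypothetical. Training the prompt under the target-answer constraint is omitted entirely.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Scaled dot-product attention where keys and values are the same rows.

    Each query row is replaced by a weighted average of the key/value rows.
    """
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        weights = softmax(scores)
        attended.append([
            sum(w * kv[j] for w, kv in zip(weights, keys_values))
            for j in range(dim)
        ])
    return attended


def fuse_with_latent_prompt(latent_prompt, image_tokens, question_tokens):
    """Hypothetical fusion step (an assumption, not the paper's module).

    The latent prompt attends over the concatenated image and question
    tokens, and the attended result is added back as a residual.
    """
    context = image_tokens + question_tokens
    attended = cross_attend(latent_prompt, context)
    return [
        [p + a for p, a in zip(prompt_row, attended_row)]
        for prompt_row, attended_row in zip(latent_prompt, attended)
    ]


rng = random.Random(0)


def random_tokens(count, dim):
    """Stand-in features; a real model would use encoder outputs."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


latent_prompt = random_tokens(4, 8)    # 4 latent prompt vectors (would be learned)
image_tokens = random_tokens(6, 8)     # e.g. image patch features
question_tokens = random_tokens(5, 8)  # e.g. question word features

fused = fuse_with_latent_prompt(latent_prompt, image_tokens, question_tokens)
print(len(fused), len(fused[0]))  # prints "4 8"
```

The fused prompt keeps the shape of the latent prompt, so in a design like this it could be passed on to an answer classifier regardless of how many image or question tokens were supplied.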

Code & Models

Repositories

garygutc/lapa_model (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques · Text and Document Classification Technologies

Methods: Focus