Tri-VQA: Triangular Reasoning Medical Visual Question Answering for Multi-Attribute Analysis
Lin Fan, Xun Gong, Cenyang Zheng, Yafei Ou

TL;DR
This paper introduces Tri-VQA, a novel causal reasoning framework for medical VQA that enhances answer credibility by explaining the reasoning process, demonstrated through superior results on multi-attribute medical datasets.
Contribution
It proposes a causal-based Tri-VQA model that constructs reverse questions to improve reasoning transparency in medical VQA systems.
Findings
Outperforms existing methods on multi-attribute medical datasets
Enhances reasoning transparency in VQA answers
Demonstrates robustness across multiple medical datasets
Abstract
Medical Visual Question Answering (Med-VQA) is a challenging research topic whose advantages include patient engagement and clinical expert involvement for second opinions. However, existing Med-VQA methods based on joint embedding fail to explain whether their results come from correct reasoning or from coincidental answers, which undermines the credibility of VQA answers. In this paper, we investigate the construction of a more cohesive and stable Med-VQA structure. Motivated by causal effect, we propose a novel Triangular Reasoning VQA (Tri-VQA) framework, which constructs reverse causal questions from the perspective of "Why this answer?" to elucidate the source of the answer and stimulate more reasonable forward reasoning processes. We evaluate our method on the Endoscopic Ultrasound (EUS) multi-attribute annotated dataset from five centers, and test it…
Taxonomy
Topics: Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques · Topic Modeling
