Towards More Faithful Natural Language Explanation Using Multi-Level Contrastive Learning in VQA

Chengen Lai; Shengli Song; Shiqi Meng; Jingyang Li; Sitong Yan; Guangneng Hu

arXiv:2312.13594·cs.CL·December 22, 2023·2 cites

TL;DR

This paper introduces MCLE, a multi-level contrastive learning approach that improves the faithfulness of natural language explanations in VQA by aligning explanations more closely with visual and factual data.

Contribution

The paper proposes a novel self-supervised contrastive learning framework that enhances explanation faithfulness in VQA by leveraging contrastive samples at the semantic, image, and instance levels.

Findings

01. Improves explanation faithfulness and logical consistency.

02. Achieves better alignment between explanations and visual facts.

03. Outperforms existing methods on VQA-NLE benchmarks.

Abstract

Natural language explanation in visual question answering (VQA-NLE) aims to explain the decision-making process of models by generating natural language sentences, increasing users' trust in black-box systems. Existing post-hoc methods have achieved significant progress in obtaining plausible explanations. However, such post-hoc explanations are not always aligned with human logical inference and suffer from three issues: 1) Deductive unsatisfiability: the generated explanations do not logically lead to the answer; 2) Factual inconsistency: the model falsifies its counterfactual explanation for answers without considering the facts in images; and 3) Semantic perturbation insensitivity: the model cannot recognize the semantic changes caused by small perturbations. These problems reduce the faithfulness of explanations generated by models. To address the above issues, we propose a…

Code & Models

Repositories

laichengen/mcle (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Advanced Graph Neural Networks