Defeasible Visual Entailment: Benchmark, Evaluator, and Reward-Driven Optimization

Yue Zhang; Liqiang Jing; Vibhav Gogate

arXiv:2412.16232·cs.CV·February 11, 2025

TL;DR

This paper introduces Defeasible Visual Entailment (DVE), a new task in which an additional update can modify the entailment relationship between an image premise and a text hypothesis. It also proposes an inference-aware evaluator and a reward-driven optimization method.

Contribution

It presents the first exploration of defeasible reasoning in visual entailment, along with a new evaluator that measures how updates change entailment strength and a reward-driven optimization method aimed at improving update quality.

Findings

01 The proposed evaluator effectively captures entailment changes.

02 Reward-driven optimization improves update quality.

03 Experimental results show enhanced accuracy and reliability.

Abstract

We introduce a new task called Defeasible Visual Entailment (DVE), where the goal is to allow the modification of the entailment relationship between an image premise and a text hypothesis based on an additional update. While this concept is well-established in Natural Language Inference, it remains unexplored in visual entailment. At a high level, DVE enables models to refine their initial interpretations, leading to improved accuracy and reliability in various applications such as detecting misleading information in images, enhancing visual question answering, and refining decision-making processes in autonomous systems. Existing metrics do not adequately capture the change in the entailment relationship brought by updates. To address this, we propose a novel inference-aware evaluator designed to capture changes in entailment strength induced by updates, using pairwise contrastive…
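To make the task setup concrete, the following is a minimal sketch of how a DVE example and a pairwise contrastive objective for an evaluator might look. It is not the authors' implementation. The class `DVEExample`, its field names, the "strengthener"/"weakener" labels (terminology borrowed from defeasible NLI), and the function `pairwise_logistic_loss` are all assumptions introduced for illustration, and the logistic form of the loss is one common choice rather than the formulation used in the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class DVEExample:
    """One hypothetical DVE instance (field names are illustrative)."""
    image_path: str   # image premise
    hypothesis: str   # text hypothesis about the image
    update: str       # additional text that may shift the entailment
    label: str        # assumed labels: "strengthener" or "weakener"


def pairwise_logistic_loss(score_strengthener: float, score_weakener: float) -> float:
    """Pairwise contrastive loss: softplus of the negative score gap.

    The loss is small when the evaluator scores the strengthening update
    well above the weakening update for the same image and hypothesis.
    Written in a numerically stable form.
    """
    gap = score_strengthener - score_weakener
    return max(-gap, 0.0) + math.log1p(math.exp(-abs(gap)))


# Two updates for the same (made-up) premise and hypothesis.
strong = DVEExample("img_001.jpg", "The man is cooking dinner.",
                    "He is stirring a pot on the stove.", "strengthener")
weak = DVEExample("img_001.jpg", "The man is cooking dinner.",
                  "The stove is switched off and the pot is empty.", "weakener")

# Hypothetical evaluator scores (higher = stronger entailment after the update).
well_ordered = pairwise_logistic_loss(2.0, -1.0)   # strengthener ranked higher
mis_ordered = pairwise_logistic_loss(-1.0, 2.0)    # weakener ranked higher
print(well_ordered < mis_ordered)
```

Under this sketch, an evaluator trained to minimize the loss learns to assign higher scores to updates that strengthen the hypothesis than to updates that weaken it, which is one way to capture a change in entailment strength as a scalar.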

Code & Models

Repositories

skywalkerzhang/Defeasible_Visual_Entailment (PyTorch, official)

Taxonomy

Topics: Human-Automation Interaction and Safety · Decision-Making and Behavioral Economics · Complex Systems and Decision Making

Methods: Contrastive Learning