Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models

Adyasha Maharana; Amita Kamath; Christopher Clark; Mohit Bansal; Aniruddha Kembhavi

arXiv:2303.16133·cs.CV·February 23, 2024·1 cites

TL;DR

This paper introduces CocoCon, a benchmark dataset for measuring cross-task consistency in vision-language models, revealing significant inconsistency issues and proposing a training method to improve consistency without sacrificing accuracy.

Contribution

The paper presents CocoCon, a novel benchmark for evaluating cross-task consistency, and proposes a rank correlation-based training objective to enhance model consistency across heterogeneous tasks.
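To make the notion of rank correlation concrete, here is a minimal sketch of Spearman's rank correlation between the scores two tasks assign to the same set of candidates. It illustrates rank correlation in general and is not the paper's training objective; the function names and the example scores are hypothetical, and the formula used assumes there are no tied scores.

```python
def ranks(scores):
    """Return the rank of each score (0 = highest). Assumes no ties."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order):
        result[index] = position
    return result


def spearman(scores_a, scores_b):
    """Spearman rank correlation for two equal-length, tie-free score lists."""
    n = len(scores_a)
    if n != len(scores_b) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    rank_a, rank_b = ranks(scores_a), ranks(scores_b)
    squared_diffs = sum((x - y) ** 2 for x, y in zip(rank_a, rank_b))
    return 1 - 6 * squared_diffs / (n * (n ** 2 - 1))


# Hypothetical scores that two tasks assign to the same three candidates.
task_a_scores = [0.9, 0.5, 0.1]
task_b_scores = [3.0, 2.0, 1.0]
print(spearman(task_a_scores, task_b_scores))
```

A value near 1 means the two tasks order the candidates the same way; a value near -1 means they order them in reverse.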

Findings

01 State-of-the-art models exhibit high inconsistency across tasks.

02 The proposed training method improves multi-task consistency.

03 Models retain their original accuracy after applying the new training objective.

Abstract

As general purpose vision models get increasingly effective at a wide set of tasks, it is imperative that they be consistent across the tasks they support. Inconsistent AI models are considered brittle and untrustworthy by human users and are more challenging to incorporate into larger systems that take dependencies on their outputs. Measuring consistency between very heterogeneous tasks that might include outputs in different modalities is challenging since it is difficult to determine if the predictions are consistent with one another. As a solution, we introduce a benchmark dataset, CocoCon, where we create contrast sets by modifying test instances for multiple tasks in small but semantically meaningful ways to change the gold label and outline metrics for measuring if a model is consistent by ranking the original and perturbed instances across tasks. We find that state-of-the-art…
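The ranking idea described in the abstract can be sketched as follows. This is an illustrative simplification, not the benchmark's actual metric: it assumes each task yields one scalar score for the original instance and one for the perturbed instance, and it calls a model consistent on an example when every task prefers the same one of the two. The task names, scores, and function names are hypothetical.

```python
def is_consistent(task_scores):
    """task_scores maps a task name to (score_original, score_perturbed).

    Returns True when all tasks prefer the original, or all prefer the
    perturbed instance (an assumed, simplified notion of consistency).
    """
    prefers_original = [orig > pert for orig, pert in task_scores.values()]
    return all(prefers_original) or not any(prefers_original)


def consistency_rate(examples):
    """Fraction of examples on which the tasks agree."""
    return sum(is_consistent(e) for e in examples) / len(examples)


# Hypothetical scores for two contrast-set examples.
examples = [
    {"captioning": (0.8, 0.3), "vqa": (0.7, 0.2)},  # tasks agree
    {"captioning": (0.8, 0.3), "vqa": (0.1, 0.6)},  # tasks disagree
]
print(consistency_rate(examples))
```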

Code & Models

Repositories

adymaharana/cococon · jax · Official

Datasets

adymaharana/cococon
dataset · 69 dl

Taxonomy

Topics: Semantic Web and Ontologies · Constraint Satisfaction and Optimization · Robotics and Automated Systems

Methods: Test