Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models
Adyasha Maharana, Amita Kamath, Christopher Clark, Mohit Bansal, Aniruddha Kembhavi

TL;DR
This paper introduces CocoCon, a benchmark dataset for measuring cross-task consistency in vision-language models, revealing significant inconsistency issues and proposing a training method to improve consistency without sacrificing accuracy.
Contribution
The paper presents CocoCon, a novel benchmark for evaluating cross-task consistency, and proposes a rank correlation-based training objective to enhance model consistency across heterogeneous tasks.
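As a rough illustration of what a rank correlation measures in this setting, the sketch below computes Spearman's coefficient between the scores that two tasks assign to the same set of candidate outputs. It is not the paper's training objective; the function names `ranks` and `spearman` and the example scores are assumptions made for illustration, and ties are assumed not to occur.

```python
def ranks(scores):
    """Return rank positions (0 = highest score). Assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for position, index in enumerate(order):
        result[index] = position
    return result


def spearman(scores_a, scores_b):
    """Spearman rank correlation for two equal-length, tie-free score lists."""
    n = len(scores_a)
    if n != len(scores_b) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    ranks_a, ranks_b = ranks(scores_a), ranks(scores_b)
    # Sum of squared rank differences between the two tasks.
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical scores two tasks give to the same three candidates.
agree = spearman([0.9, 0.5, 0.1], [0.8, 0.6, 0.2])     # same ordering
disagree = spearman([0.9, 0.5, 0.1], [0.1, 0.5, 0.9])  # reversed ordering
```

A value near 1 means the two tasks order the candidates the same way; a value near -1 means they order them in opposite ways.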
Findings
State-of-the-art models exhibit high inconsistency across tasks.
The proposed training method improves multi-task consistency.
Models retain their original accuracy after applying the new training objective.
Abstract
As general purpose vision models get increasingly effective at a wide set of tasks, it is imperative that they be consistent across the tasks they support. Inconsistent AI models are considered brittle and untrustworthy by human users and are more challenging to incorporate into larger systems that take dependencies on their outputs. Measuring consistency between very heterogeneous tasks that might include outputs in different modalities is challenging since it is difficult to determine if the predictions are consistent with one another. As a solution, we introduce a benchmark dataset, CocoCon, where we create contrast sets by modifying test instances for multiple tasks in small but semantically meaningful ways to change the gold label and outline metrics for measuring if a model is consistent by ranking the original and perturbed instances across tasks. We find that state-of-the-art…
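The ranking idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the benchmark's actual metric: it assumes each task yields one score for the original instance and one for the perturbed instance, and it treats a model as consistent on a contrast set when every task ranks the pair the same way. The names `is_consistent` and `consistency_rate`, the task names, and the scores are hypothetical.

```python
def is_consistent(task_scores):
    """task_scores maps a task name to (score_original, score_perturbed).

    Assumption: the model is consistent on this contrast set if all tasks
    agree on which of the two instances receives the higher score.
    """
    preferences = {original > perturbed
                   for original, perturbed in task_scores.values()}
    return len(preferences) == 1


def consistency_rate(contrast_sets):
    """Fraction of contrast sets on which all tasks agree."""
    if not contrast_sets:
        raise ValueError("no contrast sets given")
    return sum(is_consistent(cs) for cs in contrast_sets) / len(contrast_sets)


# Hypothetical scores for two contrast sets over two tasks.
contrast_sets = [
    {"captioning": (0.8, 0.3), "vqa": (0.7, 0.4)},  # both prefer the original
    {"captioning": (0.8, 0.3), "vqa": (0.2, 0.6)},  # the tasks disagree
]
rate = consistency_rate(contrast_sets)
```

With these made-up scores, one of the two contrast sets is ranked consistently, so `rate` is 0.5.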
Taxonomy
Topics: Semantic Web and Ontologies · Constraint Satisfaction and Optimization · Robotics and Automated Systems
Methods: Test
