HMGIE: Hierarchical and Multi-Grained Inconsistency Evaluation for Vision-Language Data Cleansing

Zihao Zhu; Hongbao Zhang; Guanzong Wu; Siwei Lyu; Baoyuan Wu

arXiv:2412.05685 · cs.CV · December 10, 2024

TL;DR

HMGIE is an adaptive, hierarchical framework that evaluates visual-textual inconsistencies at multiple granularities, improving data cleansing for diverse vision-language datasets.

Contribution

The paper introduces HMGIE, a novel multi-grained inconsistency evaluation framework with a semantic graph and hierarchical assessment modules for better vision-language data cleansing.

Findings

1. HMGIE outperforms existing methods on benchmark datasets.
2. The framework effectively handles diverse inconsistency types.
3. The authors construct the MVTID dataset for comprehensive evaluation.

Abstract

Visual-textual inconsistency (VTI) evaluation plays a crucial role in cleansing vision-language data. Its main challenges stem from the high variety of image captioning datasets, where differences in content can create a range of inconsistencies (e.g., inconsistencies in scene, entities, entity attributes, entity numbers, entity interactions). Moreover, variations in caption length can introduce inconsistencies at different levels of granularity as well. To tackle these challenges, we design an adaptive evaluation framework, called Hierarchical and Multi-Grained Inconsistency Evaluation (HMGIE), which can provide multi-grained evaluations covering both accuracy and completeness for various image-caption pairs. Specifically, the HMGIE framework is implemented by three consecutive modules. Firstly, the semantic graph generation module converts the image caption to a semantic graph for…
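
The inconsistency categories named in the abstract (scene, entities, entity attributes, entity numbers, entity interactions) can be made concrete with a toy data structure. The sketch below is not the authors' implementation: the `SemanticGraph` class, the `find_inconsistencies` function, and the idea of comparing a caption graph against a second graph describing what is actually in the image are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SemanticGraph:
    """Hypothetical caption graph: a scene label, counted entities,
    per-entity attributes, and (subject, predicate, object) relations."""
    scene: Optional[str] = None
    entities: dict = field(default_factory=dict)    # entity name -> count
    attributes: dict = field(default_factory=dict)  # entity name -> set of attributes
    relations: set = field(default_factory=set)     # {(subject, predicate, object)}


def find_inconsistencies(caption: SemanticGraph, observed: SemanticGraph) -> list:
    """Return (category, detail) pairs for caption claims that the
    observed graph does not support. Illustrative logic only."""
    issues = []
    if caption.scene and observed.scene and caption.scene != observed.scene:
        issues.append(("scene", caption.scene))
    for name, count in caption.entities.items():
        if name not in observed.entities:
            issues.append(("entity", name))
            continue
        if count != observed.entities[name]:
            issues.append(("entity_number", name))
        unsupported = caption.attributes.get(name, set()) - observed.attributes.get(name, set())
        for attr in sorted(unsupported):
            issues.append(("entity_attribute", f"{name}:{attr}"))
    for rel in sorted(caption.relations - observed.relations):
        issues.append(("entity_interaction", " ".join(rel)))
    return issues


# Caption: "Two brown dogs chase a ball in a park."
caption = SemanticGraph(
    scene="park",
    entities={"dog": 2, "ball": 1},
    attributes={"dog": {"brown"}},
    relations={("dog", "chase", "ball")},
)
# Made-up description of the image content: one black dog sitting beside a ball.
observed = SemanticGraph(
    scene="park",
    entities={"dog": 1, "ball": 1},
    attributes={"dog": {"black"}},
    relations={("dog", "sit beside", "ball")},
)
issues = find_inconsistencies(caption, observed)
```

In this toy example the caption disagrees with the image on the number of dogs, the dog's color, and the dog-ball interaction, while the scene and the set of entities match. A real system would obtain the image-side evidence from a vision-language model rather than from a hand-written graph.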

Taxonomy

Topics: Data Quality and Management