Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning
Zihua Zhao, Mengxi Chen, Tianjie Dai, Jiangchao Yao, Bo Han, Ya Zhang, Yanfeng Wang

TL;DR
This paper introduces a Geometrical Structure Consistency method that leverages intra- and inter-modal geometrical structures to identify and mitigate noisy correspondences in cross-modal datasets, improving learning accuracy.
Contribution
The paper proposes a novel GSC approach that preserves and utilizes geometrical structures within and across modalities to effectively detect and filter noisy data in multimodal learning.
Findings
GSC outperforms existing methods in noisy correspondence detection.
GSC effectively preserves geometrical structures to discriminate noise.
Experiments on four datasets validate the approach's robustness.
Abstract
Noisy correspondence, which refers to mismatches in cross-modal data pairs, is prevalent in human-annotated or web-crawled datasets. Prior approaches to leveraging such data mainly apply uni-modal noisy-label learning without amending the impact on both cross-modal and intra-modal geometrical structures in multimodal learning. In fact, we find that both structures, when well established, are effective for discriminating noisy correspondence through structural differences. Inspired by this observation, we introduce a Geometrical Structure Consistency (GSC) method to infer the true correspondence. Specifically, GSC ensures the preservation of geometrical structures within and between modalities, allowing for the accurate discrimination of noisy samples based on structural differences. Utilizing these inferred true correspondence labels, GSC refines the learning of…
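The idea of discriminating noisy pairs by structural differences can be illustrated with a small NumPy sketch. This is not the paper's formulation: the function names (`structure_scores`, `flag_noisy`), the scoring formulas, the equal weighting, and the threshold are all assumptions chosen for illustration. It assumes both modalities are already embedded in a shared space, scores each pair by its cross-modal cosine similarity, and scores intra-modal consistency by comparing how a sample relates to all other samples in the image space versus the text space.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def structure_scores(img, txt):
    """Return (cross, intra) consistency scores, one per pair.

    Higher values suggest a cleaner pair. Both formulas are
    illustrative assumptions, not the scores defined in the paper.
    """
    img = l2_normalize(img)
    txt = l2_normalize(txt)

    # Cross-modal structure: cosine similarity of each image-text pair.
    cross = np.sum(img * txt, axis=1)

    # Intra-modal structure: similarity of each sample to every other
    # sample, computed separately inside each modality.
    sim_img = img @ img.T
    sim_txt = txt @ txt.T
    # Drop self-similarity, which is 1 in both modalities by construction.
    np.fill_diagonal(sim_img, 0.0)
    np.fill_diagonal(sim_txt, 0.0)

    # Cosine between the two neighborhood profiles of the same index.
    intra = np.sum(l2_normalize(sim_img) * l2_normalize(sim_txt), axis=1)
    return cross, intra


def flag_noisy(cross, intra, threshold=0.5):
    """Flag pairs whose averaged score falls below a hypothetical threshold."""
    return 0.5 * (cross + intra) < threshold
```

On synthetic embeddings where a subset of captions has been shuffled, mismatched pairs tend to score low on both measures, because the shuffled caption neither aligns with its image nor relates to the rest of the dataset the way the image does. In practice the threshold would need tuning, and the embeddings themselves are learned from noisy data, which this sketch does not address.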
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Face and Expression Recognition · Image Retrieval and Classification Techniques
