Learning with Dual-level Noisy Correspondence for Multi-modal Entity Alignment
Haobin Li, Yijie Lin, Peng Hu, Mouxing Yang, Xi Peng

TL;DR
This paper introduces RULE, a robust framework for multi-modal entity alignment that effectively handles dual-level noisy correspondences in knowledge graphs, significantly improving accuracy over existing methods.
Contribution
The paper proposes a novel approach to address dual-level noisy correspondences in multi-modal entity alignment, incorporating reliability estimation and a correspondence reasoning module.
Findings
RULE outperforms seven state-of-the-art methods on five benchmarks.
The framework effectively mitigates intra-entity and inter-graph noise impacts.
Experimental results demonstrate significant accuracy improvements.
Abstract
Multi-modal entity alignment (MMEA) aims to identify equivalent entities across heterogeneous multi-modal knowledge graphs (MMKGs), where each entity is described by attributes from various modalities. Existing methods typically assume that both intra-entity and inter-graph correspondences are faultless, which is often violated in real-world MMKGs due to the reliance on expert annotations. In this paper, we reveal and study a highly practical yet under-explored problem in MMEA, termed Dual-level Noisy Correspondence (DNC). DNC refers to misalignments in both intra-entity (entity-attribute) and inter-graph (entity-entity and attribute-attribute) correspondences. To address the DNC problem, we propose a robust MMEA framework termed RULE. RULE first estimates the reliability of both intra-entity and inter-graph correspondences via a dedicated two-fold principle. Leveraging the estimated…
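The abstract states that RULE estimates a reliability for each correspondence, but the excerpt is cut off before it says how those estimates are used. As a purely illustrative sketch, and not the authors' formulation, one common pattern in noisy-correspondence learning is to down-weight unreliable pairs in a contrastive loss. The function name `reliability_weighted_info_nce`, the weight vector, and the default temperature below are all assumptions made for this example.

```python
import math


def reliability_weighted_info_nce(sim, weights, tau=0.1):
    """Contrastive (InfoNCE-style) loss with per-pair reliability weights.

    Hypothetical helper, not taken from the paper.
    sim:     n x n similarity matrix (list of lists); sim[i][i] is the
             similarity of the i-th annotated (possibly noisy) pair.
    weights: n reliability scores in [0, 1]; 0 ignores a pair entirely.
    tau:     softmax temperature (assumed value).
    """
    total, weight_sum = 0.0, 0.0
    for i, (row, w) in enumerate(zip(sim, weights)):
        logits = [s / tau for s in row]
        # Numerically stable log-sum-exp over the row.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of the annotated partner, scaled by w.
        total += w * (log_denom - logits[i])
        weight_sum += w
    # Normalise by total weight so the loss scale does not depend on
    # how many pairs were down-weighted.
    return total / weight_sum if weight_sum > 0 else 0.0


# Pair 0 is clean (high diagonal similarity); pair 1 is mislabelled.
sim = [[0.9, 0.1],
       [0.8, 0.2]]
uniform = reliability_weighted_info_nce(sim, [1.0, 1.0])
weighted = reliability_weighted_info_nce(sim, [1.0, 0.0])
```

With the mislabelled pair given zero weight, the loss is driven only by the clean pair, so `weighted` is much smaller than `uniform`. How RULE actually derives and applies its reliability scores is not recoverable from this excerpt.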
Peer Reviews
Decision: ICLR 2026 Oral
S1. The paper convincingly argues that Dual-level Noisy Correspondence is ubiquitous in real MMKG scenarios, yet largely ignored by prior MMEA works. The motivation is well-founded and supported by quantitative evidence of high noise ratios.
S2. RULE provides an end-to-end framework that integrates reliability estimation, noise-aware learning, multi-modal fusion, and test-time reasoning. The combination of training-time robustness and inference-time semantic reasoning is novel and compelling.
W1. RULE contains several components (uncertainty estimation, consensus reasoning, robust learning, fusion, test-time MLLM reasoning). While each part is motivated, the overall system introduces significant complexity, making it hard to isolate which component contributes most.
W2. Although TTR appears effective, the paper lacks a thorough analysis of computational overhead, scalability, and failure modes. Using a large MLLM at inference time raises concerns for deployment.
W3. Although TTR is
1. The identification and study of dual-level noisy correspondence is a novel and significant contribution to the MMEA task. It acknowledges that real-world knowledge graphs often contain significant noise both within entities and across graphs, which previous methods have largely ignored.
2. The proposed RULE framework leverages reliability estimation using uncertainty and consensus to combat noisy correspondences. The test-time correspondence reasoning module is a key innovation that improves
1. This paper includes some parameter analysis like trade-off parameter λ, threshold β, and temperature τ, but there is no comprehensive discussion of how these hyperparameters affect the model’s robustness under various noise settings. For instance, how does RULE perform with different datasets or in cases where the noise level is very high, e.g., >50%?
2. RULE relies heavily on pre-trained CLIP models for image and text embeddings. While this is a reasonable approach, it could limit the flexi
1. This paper is well-motivated and is clearly organized.
2. The intuitive illustrations in Figure 1 make the DNC phenomenon and its negative impacts clear.
3. The experiments are convincing, e.g., the visualizations (Figure 3b) of the reliability estimation, pair division process (Figure 4) further prove the effectiveness of the proposed RULE.
4. Most impressively, Appendix B not only analyzes the noise statistics in real-world datasets but also discusses the underlying causes of DNC. Correspon
1. Since both uncertainty and consensus are estimated through cross-graph relationships, I don’t understand why such relationships could be employed to estimate reliability of intra-graph attributes. A more intuitive and detailed explanation of this design would be very helpful for understanding.
2. In my understanding, the MLLM-based reasoning module aims to uncover underlying associations between attributes. I am interested in what kinds of associations can be further mined by the MLLM, for ex
Taxonomy
Topics: Advanced Graph Neural Networks · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
