Multi-Grained Multimodal Interaction Network for Entity Linking
Pengfei Luo, Tong Xu, Shiwei Wu, Chen Zhu, Linli Xu, Enhong Chen

TL;DR
This paper introduces MIMIC, a novel multimodal interaction network that enhances entity linking by capturing fine-grained features and reducing modality inconsistency through specialized interaction units and contrastive learning.
Contribution
The paper proposes a multi-grained multimodal interaction framework with three novel modules and a contrastive learning objective to improve entity linking accuracy and robustness.
Findings
Outperforms state-of-the-art methods on three benchmark datasets.
Effective in capturing detailed textual and visual features.
Reduces modality inconsistency through contrastive learning.
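To make the contrastive learning finding concrete, here is a minimal sketch of a generic in-batch, InfoNCE-style contrastive loss. This summary does not give the paper's exact objective, so the form below is an assumption for illustration only; the function name `info_nce_loss`, the `temperature` value, and the toy vectors are all hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce_loss(mention_vecs, entity_vecs, temperature=0.1):
    """Generic in-batch contrastive loss (an assumed form, not the paper's).

    Mention i is paired with entity i as its positive; every other entity
    in the batch acts as a negative. Returns the mean loss over mentions.
    """
    mentions = [normalize(v) for v in mention_vecs]
    entities = [normalize(v) for v in entity_vecs]
    total = 0.0
    for i, m in enumerate(mentions):
        # Cosine similarity to every entity, sharpened by the temperature.
        logits = [dot(m, e) / temperature for e in entities]
        # Numerically stable log-sum-exp over the batch.
        max_logit = max(logits)
        log_denom = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        # Negative log-probability of the matching entity.
        total += log_denom - logits[i]
    return total / len(mentions)


# Toy 2-d embeddings: matched pairs should score a lower loss than swapped ones.
aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

Minimizing a loss of this kind pulls each mention embedding toward its matching entity and away from the other entities in the batch, which is one common way to encourage consistent representations.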
Abstract
The multimodal entity linking (MEL) task, which aims at resolving ambiguous mentions to a multimodal knowledge graph, has attracted wide attention in recent years. Although considerable effort has been made to explore the complementary effect among multiple modalities, existing methods may fail to fully absorb the comprehensive expression of abbreviated textual context and implicit visual indication. Even worse, the inevitable noisy data may cause inconsistency across modalities during the learning process, which severely degrades performance. To address these issues, we propose a novel Multi-GraIned Multimodal InteraCtion Network (MIMIC) framework for solving the MEL task. Specifically, the unified inputs of mentions and entities are first encoded by textual and visual encoders separately to extract global descriptive features and local detailed features.…
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Biomedical Text Mining and Ontologies
Methods: Contrastive Learning
