Multi-level Mixture of Experts for Multimodal Entity Linking
Zhiwei Hu, Víctor Gutiérrez-Basulto, Zhiliang Xiang, Ru Li, Jeff Z. Pan

TL;DR
This paper introduces a Multi-level Mixture of Experts (MMoE) model for Multimodal Entity Linking (MEL) that targets two problems, mention ambiguity and the dynamic selection of modal content, in order to improve linking accuracy.
Contribution
It proposes a novel MMoE framework with mention enhancement and dynamic feature selection modules for improved multimodal entity linking.
Findings
Outperforms state-of-the-art MEL methods in experiments
Effectively mitigates mention ambiguity issues
Dynamically selects relevant modal features
Abstract
Multimodal Entity Linking (MEL) aims to link ambiguous mentions within multimodal contexts to associated entities in a multimodal knowledge base. Existing approaches to MEL introduce multimodal interaction and fusion mechanisms to bridge the modality gap and enable multi-grained semantic matching. However, they do not address two important problems: (i) mention ambiguity, i.e., the lack of semantic content caused by the brevity and omission of key information in the mention's textual context; (ii) dynamic selection of modal content, i.e., to dynamically distinguish the importance of different parts of modal information. To mitigate these issues, we propose a Multi-level Mixture of Experts (MMoE) model for MEL. MMoE has four components: (i) the description-aware mention enhancement module leverages large language models to identify the WikiData descriptions that best match a mention,…
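The abstract is cut off before the mixture-of-experts components are described, so the following is only a generic illustration of the idea behind "dynamic selection": a gate scores several experts for a given feature vector, keeps the top-k, and mixes their outputs by normalized gate weight. It is a minimal sketch in plain Python, not the paper's architecture. Every name here (`top_k_gate`, `moe_forward`, `gate_weights`, the toy experts) is hypothetical, and the linear gate with top-k softmax is an assumed, commonly used design rather than something stated in the source.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Expert = Callable[[Sequence[float]], Vector]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(logits: Sequence[float], k: int) -> Vector:
    """Keep the k largest logits, softmax over them, and zero out the rest."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = softmax([logits[i] for i in top])
    gate = [0.0] * len(logits)
    for i, w in zip(top, weights):
        gate[i] = w
    return gate


def moe_forward(
    x: Sequence[float],
    experts: Sequence[Expert],
    gate_weights: Sequence[Sequence[float]],
    k: int = 2,
) -> Tuple[Vector, Vector]:
    """Mix expert outputs for one feature vector.

    Assumption: the gate is linear (one weight row per expert) and every
    expert returns a vector with the same length as x.
    """
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    gate = top_k_gate(logits, k)
    out = [0.0] * len(x)
    for g, expert in zip(gate, experts):
        if g == 0.0:
            continue  # expert was not selected for this input
        y = expert(x)
        out = [o + g * yi for o, yi in zip(out, y)]
    return out, gate


# Toy usage: three hand-written "experts" and a fixed gate (all hypothetical).
experts = [
    lambda v: [2.0 * a for a in v],
    lambda v: [a + 1.0 for a in v],
    lambda v: [-a for a in v],
]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
features = [0.5, 0.5]
output, gate = moe_forward(features, experts, gate_weights, k=2)
print(output, gate)
```

In this toy setup the third expert receives the lowest gate score and is skipped, so only two experts contribute to the output. In a trained model the gate weights and the experts would be learned, which is what would let the selection adapt to each mention.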
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Sentiment Analysis and Opinion Mining
