$M^3EL$: A Multi-task Multi-topic Dataset for Multi-modal Entity Linking
Fang Wang, Shenglin Yin, Xiaoying Bai, Minghao Hu, Tianwei Yan, Yi Liang

TL;DR
The paper introduces $M^3EL$, a large-scale multi-modal entity linking dataset covering diverse tasks and topics, and demonstrates its effectiveness in improving model performance through a new training strategy.
Contribution
It presents a novel large-scale dataset for multi-modal entity linking with diverse topics and tasks, along with a modality-augmented training strategy to enhance model generalization.
Findings
Existing models perform poorly due to limited data and coverage.
$M^3EL$ significantly improves model accuracy across tasks.
The proposed training strategy enhances multi-modal model adaptability.
Abstract
Multi-modal Entity Linking (MEL) is a fundamental component for various downstream tasks. However, existing MEL datasets suffer from small scale, scarcity of topic types and limited coverage of tasks, making them incapable of effectively enhancing the entity linking capabilities of multi-modal models. To address these obstacles, we propose a dataset construction pipeline and publish $M^3EL$, a large-scale dataset for MEL. $M^3EL$ includes 79,625 instances, covering 9 diverse multi-modal tasks and 5 different topics. In addition, to further improve the model's adaptability to multi-modal tasks, we propose a modality-augmented training strategy. Utilizing $M^3EL$ as a corpus, we train a model and conduct a comparative analysis with existing multi-modal baselines. Experimental results show that…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Text Analysis Techniques
