Multimodal Entity Tagging with Multimodal Knowledge Base

Hao Peng; Hang Li; Lei Hou; Juanzi Li; Chao Qiao

arXiv:2201.00693 · cs.IR · July 29, 2022 · 1 cites

TL;DR

This paper introduces multimodal entity tagging (MET), a new task that uses a multimodal knowledge base to identify related entities in text-image pairs, supported by a new dataset and baseline methods.

Contribution

It defines the MET task, creates a dataset based on an existing multimodal knowledge base, and provides initial baseline solutions using current NLP and CV techniques.

Findings

01. The task is challenging but feasible with current methods.

02. Baseline models achieve relatively high performance.

03. Extensive experiments and analyses validate the approach.

Abstract

To advance research on multimodal knowledge bases and multimodal information processing, we propose a new task called multimodal entity tagging (MET) with a multimodal knowledge base (MKB). We also develop a dataset for the problem using an existing MKB. An MKB contains entities and their associated texts and images. In MET, given a text-image pair, one uses the information in the MKB to automatically identify the related entity in the text-image pair. We cast the task as an information retrieval problem and implement several baselines using state-of-the-art methods in NLP and CV. We conduct extensive experiments and analyze the results. The results show that the task is challenging, but current technologies can achieve relatively high performance. We will release the dataset, code, and models for future research.
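The retrieval framing in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Entity` record, the `rank_entities` function, the `text_weight` fusion parameter, and the toy two-dimensional vectors are all hypothetical, and a real system would obtain embeddings from trained text and image encoders. The sketch only shows the general idea of scoring each MKB entity against a text-image query and returning the best match.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Entity:
    """Hypothetical MKB entry: a name plus precomputed text and image embeddings."""
    name: str
    text_vec: List[float]
    image_vec: List[float]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_entities(
    query_text_vec: List[float],
    query_image_vec: List[float],
    kb: List[Entity],
    text_weight: float = 0.5,  # assumed late-fusion weight, not from the paper
) -> List[Tuple[str, float]]:
    """Score every entity by a weighted sum of text and image similarity,
    then return (name, score) pairs sorted from best to worst."""
    scored = []
    for entity in kb:
        text_sim = cosine(query_text_vec, entity.text_vec)
        image_sim = cosine(query_image_vec, entity.image_vec)
        score = text_weight * text_sim + (1.0 - text_weight) * image_sim
        scored.append((entity.name, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


# Toy knowledge base with hand-made embeddings (illustrative only).
TOY_KB = [
    Entity("entity_a", [1.0, 0.0], [1.0, 0.0]),
    Entity("entity_b", [0.0, 1.0], [0.0, 1.0]),
]

# A query text-image pair whose embeddings lie close to entity_a.
ranking = rank_entities([0.9, 0.1], [0.8, 0.2], TOY_KB)
predicted_entity = ranking[0][0]
```

Here the top-ranked entry is taken as the tagged entity. Weighted late fusion is only one possible design; the two modalities could also be combined inside a joint encoder before scoring.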

Code & Models

Repositories

h-peng17/mmet (jax, Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies

Methods: Balanced Selection