Multimodal Relational Triple Extraction with Query-based Entity Object Transformer

Lei Hei; Ning An; Tingjing Liao; Qi Ma; Jiaqi Wang; Feiliang Ren

arXiv:2408.08709·cs.IR·August 19, 2024

Open Access

TL;DR

This paper introduces a novel multimodal relation extraction task and a query-based model that jointly extracts entity-object triples from image-text pairs, improving accuracy and efficiency over previous methods.

Contribution

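The paper proposes a new task (Multimodal Entity-Object Relational Triple Extraction), a dataset adapted from MORE, and QEOT, a query-based model with a selective attention mechanism, for jointly extracting entity spans, relations, and object regions from image-text pairs.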

Findings

01 Outperforms existing baselines by 8.06%

02 Creates a new dataset with 20,264 triples

03 Achieves state-of-the-art performance

Abstract

Multimodal Relation Extraction is crucial for constructing flexible and realistic knowledge graphs. Recent studies focus on extracting the relation type for entity pairs that appear in different modalities, such as one entity in the text and another in the image. However, existing approaches require the entities and objects to be given beforehand, which is costly and impractical. To address this limitation, we propose a novel task, Multimodal Entity-Object Relational Triple Extraction, which aims to extract all triples (entity span, relation, object region) from image-text pairs. To facilitate this study, we modified MORE, a multimodal relation extraction dataset with 21 relation types, to create a new dataset containing 20,264 triples, averaging 5.75 triples per image-text pair. Moreover, we propose QEOT, a query-based model with a selective attention mechanism, to dynamically explore the…
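To make the output format of the task concrete, the sketch below shows one plausible way to represent a (entity span, relation, object region) triple in Python. It is an illustration only and is not taken from the paper: the class name `EntityObjectTriple`, the half-open token-offset convention, the `(x_min, y_min, x_max, y_max)` box layout, the helper `span_text`, the example sentence, and the relation label are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EntityObjectTriple:
    """Hypothetical container for one extracted triple (not the paper's format)."""
    entity_span: Tuple[int, int]  # assumed [start, end) token offsets in the text
    relation: str                 # one of the dataset's relation types
    object_region: Tuple[float, float, float, float]  # assumed (x_min, y_min, x_max, y_max)


def span_text(tokens: List[str], triple: EntityObjectTriple) -> str:
    """Recover the surface text of the entity span from the token list."""
    start, end = triple.entity_span
    return " ".join(tokens[start:end])


# Made-up image-text pair; the relation label is a placeholder, not a MORE label.
tokens = "A man rides a horse on the beach".split()
triples = [
    EntityObjectTriple(
        entity_span=(1, 2),
        relation="placeholder_relation",
        object_region=(10.0, 20.0, 110.0, 220.0),
    )
]

for t in triples:
    print(span_text(tokens, t), t.relation, t.object_region)
```

Under this representation, a model for the task would return a list of such triples per image-text pair, with neither the entity spans nor the object regions supplied in advance.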

Taxonomy

Topics: Data Mining Algorithms and Applications · Semantic Web and Ontologies · Natural Language Processing Techniques

Methods: Softmax · Attention Is All You Need · Sparse Evolutionary Training · Focus