Joint Multimodal Entity-Relation Extraction Based on Edge-enhanced Graph Alignment Network and Word-pair Relation Tagging
Li Yuan, Yi Cai, Jin Wang, Qing Li

TL;DR
This paper introduces a joint multimodal entity-relation extraction model that uses edge-enhanced graph alignment and word-pair relation tagging to capture the interaction between entity recognition and relation extraction, improving accuracy in multimodal knowledge graph construction.
Contribution
It is the first work to perform multimodal named entity recognition (MNER) and multimodal relation extraction (MRE) jointly, using an edge-enhanced graph alignment network and word-pair relation tagging to capture richer entity and relation interactions.
Findings
The proposed model outperforms previous methods in accuracy.
Edge information improves node and edge alignment.
Joint modeling reduces error propagation.
Abstract
Multimodal named entity recognition (MNER) and multimodal relation extraction (MRE) are two fundamental subtasks of multimodal knowledge graph construction. However, existing methods usually handle the two tasks independently, which ignores the bidirectional interaction between them. This paper is the first to propose performing MNER and MRE jointly, as a joint multimodal entity-relation extraction (JMERE) task. Moreover, current MNER and MRE models only align visual objects with textual entities in the visual and textual graphs, ignoring entity-entity and object-object relationships. To address these challenges, we propose an edge-enhanced graph alignment network with word-pair relation tagging (EEGA) for the JMERE task. Specifically, we first design a word-pair relation tagging scheme to exploit the bidirectional interaction between MNER and MRE and…
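The abstract is cut off before it describes the tagging scheme, so the following is only a minimal sketch of the general idea of word-pair tagging, not the paper's actual scheme. It assumes a simplified layout in which word pairs inside one entity span carry the entity type and word pairs linking a head entity to a tail entity carry the relation label. The function name `build_word_pair_grid`, the tag names (`PER`, `ORG`, `member_of`, `O`), and the example sentence are all hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive token indices (start, end)


def build_word_pair_grid(
    tokens: List[str],
    entities: Dict[Span, str],
    relations: List[Tuple[Span, Span, str]],
) -> List[List[str]]:
    """Fill an n x n grid of word-pair tags (hypothetical, simplified layout)."""
    n = len(tokens)
    grid = [["O"] * n for _ in range(n)]

    # Entity cells: each word pair (i, j) with i <= j inside one entity span
    # is tagged with that entity's type.
    for (start, end), ent_type in entities.items():
        for i in range(start, end + 1):
            for j in range(i, end + 1):
                grid[i][j] = ent_type

    # Relation cells: each (head word, tail word) pair is tagged with the
    # relation label, so entity and relation tags share one table.
    for head, tail, rel in relations:
        for i in range(head[0], head[1] + 1):
            for j in range(tail[0], tail[1] + 1):
                grid[i][j] = rel

    return grid


# Made-up example sentence and annotations, for illustration only.
tokens = ["Lionel", "Messi", "joins", "Inter", "Miami"]
entities = {(0, 1): "PER", (3, 4): "ORG"}
relations = [((0, 1), (3, 4), "member_of")]
grid = build_word_pair_grid(tokens, entities, relations)

for token, row in zip(tokens, grid):
    print(f"{token:>8}", row)
```

Because both kinds of label live in the same grid, a single tagger predicts entities and relations together rather than in a pipeline, which is one way the bidirectional interaction described above could be exposed to a model.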
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Semantic Web and Ontologies
