Minimize Exposure Bias of Seq2Seq Models in Joint Entity and Relation Extraction

Ranran Haoran Zhang; Qianying Liu; Aysa Xuemo Fan; Heng Ji; Daojian Zeng; Fei Cheng; Daisuke Kawahara; Sadao Kurohashi

arXiv:2009.07503·cs.CL·October 7, 2020

TL;DR

This paper introduces Seq2UMTree (Sequence-to-Unordered-Multi-Tree), a model for joint entity and relation extraction that reduces exposure bias by limiting the decoding length to three within a triplet and removing the order among triplets, leading to better generalization.

Contribution

The paper proposes Seq2UMTree, a Seq2Seq variant that minimizes the effects of exposure bias by limiting the decoding length to three within a triplet and removing the order among triplets, improving over prior Seq2Seq models.

Findings

01. Seq2UMTree outperforms traditional Seq2Seq models on the DuIE and NYT datasets.

02. Seq2Seq models tend to overfit to frequent label combinations because of exposure bias.

03. Limiting the decoding length and removing the order among triplets improves generalization.

Abstract

Joint entity and relation extraction aims to extract relation triplets from plain text directly. Prior work leverages Sequence-to-Sequence (Seq2Seq) models for triplet sequence generation. However, Seq2Seq enforces an unnecessary order on the unordered triplets and involves a large decoding length, which is associated with error accumulation. These properties introduce exposure bias, which may cause the models to overfit to frequent label combinations, thus degrading generalization. We propose a novel Sequence-to-Unordered-Multi-Tree (Seq2UMTree) model to minimize the effects of exposure bias by limiting the decoding length to three within a triplet and removing the order among triplets. We evaluate our model on two datasets, DuIE and NYT, and systematically study how exposure bias alters the performance of Seq2Seq models. Experiments show that the state-of-the-art Seq2Seq model overfits to both…
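
The two problems the abstract names, an imposed order and a long decoding length, can be illustrated with a small sketch. This is not the paper's implementation or the OpenJERE code. The example triplets and the helper names (`flatten`, `exact_sequence_match`, `set_match`) are hypothetical and chosen only for illustration. The sketch shows that n triplets admit n! equally valid linearizations of length 3n, and that a set-based comparison is unaffected by the order a sequence target would impose.

```python
from itertools import permutations

# Hypothetical (subject, relation, object) triplets; not taken from DuIE or NYT.
triplets = [
    ("Obama", "born_in", "Hawaii"),
    ("Obama", "president_of", "USA"),
    ("Hawaii", "part_of", "USA"),
]


def flatten(ordered_triplets):
    """Linearize triplets into one token sequence, as a flat Seq2Seq target would be."""
    return [token for triplet in ordered_triplets for token in triplet]


def exact_sequence_match(pred, gold):
    """Order-sensitive comparison: the flattened sequences must be identical."""
    return flatten(pred) == flatten(gold)


def set_match(pred, gold):
    """Order-free comparison: only the set of triplets matters."""
    return set(pred) == set(gold)


# Every permutation of the triplets is an equally valid target sequence.
linearizations = {tuple(flatten(p)) for p in permutations(triplets)}
num_linearizations = len(linearizations)

# A flat target needs 3 tokens per triplet; decoding each triplet
# independently would need only 3 steps regardless of how many triplets exist.
flat_length = len(flatten(triplets))
per_triplet_length = len(triplets[0])

# The same triplets in a different order.
reordered = list(reversed(triplets))

print(num_linearizations, flat_length, per_triplet_length)
print(exact_sequence_match(reordered, triplets), set_match(reordered, triplets))
```

A model trained on one fixed linearization is penalized for emitting any of the others, even though all describe the same set of triplets. That arbitrary order is what the abstract describes Seq2UMTree as removing.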

Code & Models

Repositories

WindChimeRan/OpenJERE (PyTorch, official)

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence