Document-Level Relation Extraction with Adaptive Focal Loss and Knowledge Distillation
Qingyu Tan, Ruidan He, Lidong Bing, Hwee Tou Ng

TL;DR
This paper introduces a semi-supervised approach for document-level relation extraction that leverages axial attention, adaptive focal loss, and knowledge distillation to improve performance on complex multi-sentence relation extraction tasks.
Contribution
The paper presents a semi-supervised framework for DocRE built from three components (axial attention, adaptive focal loss, and knowledge distillation) and reports results that advance the state of the art on DocRED.
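To make the first component more concrete, here is a minimal sketch of axial attention over an entity-pair grid. It assumes each pair (s, o) has a feature vector, and that the pair attends only along its own row (s, k) and column (k, o), which is where the intermediate entity of a two-hop relation would appear. The function names, the scaled dot-product scoring, and the residual sum are illustrative assumptions; the paper's module presumably uses learned projections, which are omitted here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Scaled dot-product attention with keys reused as values (no learned weights)."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    return [sum(w * key[i] for w, key in zip(weights, keys)) for i in range(d)]


def axial_attention(grid):
    """grid[s][o] is the feature vector of entity pair (s, o).

    Each pair attends along its row (s, k) and its column (k, o), then adds
    both contexts to its own features as a residual (an assumption of this sketch).
    """
    n = len(grid)
    out = []
    for s in range(n):
        row_out = []
        for o in range(n):
            q = grid[s][o]
            row_ctx = attend(q, [grid[s][k] for k in range(n)])
            col_ctx = attend(q, [grid[k][o] for k in range(n)])
            row_out.append([a + b + c for a, b, c in zip(q, row_ctx, col_ctx)])
        out.append(row_out)
    return out
```

Restricting attention to one row and one column costs O(n) per pair instead of O(n^2) for full attention over all pairs, which is the usual motivation for the axial form.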
Findings
Outperforms the previous SOTA on the DocRED leaderboard by 1.36 F1 and 1.46 Ign_F1.
Effectively handles class imbalance with adaptive focal loss.
Improves extraction of two-hop relations using axial attention.
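The class-imbalance finding is easier to follow with the standard focal loss in view. The sketch below shows the generic form, -(1 - p)^gamma * log(p), not the paper's adaptive variant, whose exact formulation is not given in this summary. The function names and the default gamma of 2.0 are assumptions chosen for illustration.

```python
import math


def cross_entropy(prob_true):
    """Plain cross-entropy for one example, given the probability of the true label."""
    if not 0.0 < prob_true <= 1.0:
        raise ValueError("prob_true must be in (0, 1]")
    return -math.log(prob_true)


def focal_loss(prob_true, gamma=2.0):
    """Standard focal loss: cross-entropy scaled by (1 - p)^gamma.

    Well-classified examples (p close to 1) are down-weighted, so rare or hard
    classes contribute relatively more to the total loss.
    """
    if not 0.0 < prob_true <= 1.0:
        raise ValueError("prob_true must be in (0, 1]")
    return ((1.0 - prob_true) ** gamma) * cross_entropy(prob_true)


# Compare an easy example (p = 0.9) with a hard one (p = 0.1).
for p in (0.9, 0.1):
    print(p, cross_entropy(p), focal_loss(p))
```

With gamma set to 0 the modulating factor is 1 and the loss reduces to ordinary cross-entropy; larger gamma shifts more of the loss toward hard examples.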
Abstract
Document-level Relation Extraction (DocRE) is a more challenging task compared to its sentence-level counterpart. It aims to extract relations from multiple sentences at once. In this paper, we propose a semi-supervised framework for DocRE with three novel components. Firstly, we use an axial attention module for learning the interdependency among entity-pairs, which improves the performance on two-hop relations. Secondly, we propose an adaptive focal loss to tackle the class imbalance problem of DocRE. Lastly, we use knowledge distillation to overcome the differences between human annotated data and distantly supervised data. We conducted experiments on two DocRE datasets. Our model consistently outperforms strong baselines and its performance exceeds the previous SOTA by 1.36 F1 and 1.46 Ign_F1 score on the DocRED leaderboard. Our code and data will be released at…
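The knowledge distillation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a teacher model supplies soft relation logits for an entity pair and a student is trained to match them under a mean-squared-error objective. The MSE choice, the function names, and the `alpha` mixing weight are assumptions for illustration; this summary does not specify the paper's exact objective.

```python
def distillation_loss(student_logits, teacher_logits):
    """Mean squared error between student and teacher logits for one entity pair."""
    if len(student_logits) != len(teacher_logits):
        raise ValueError("student and teacher must score the same relation classes")
    squared = [(s - t) ** 2 for s, t in zip(student_logits, teacher_logits)]
    return sum(squared) / len(squared)


def combined_loss(hard_label_loss, student_logits, teacher_logits, alpha=1.0):
    """Hypothetical mix of a label-based loss with the distillation term."""
    return hard_label_loss + alpha * distillation_loss(student_logits, teacher_logits)


# Example: the student disagrees with the teacher on the second relation class.
student = [2.0, -1.0, 0.5]
teacher = [2.0, 1.0, 0.5]
print(distillation_loss(student, teacher))
```

Soft teacher logits carry graded evidence for every relation class, which is one plausible reason they would be less brittle than noisy distantly supervised labels used directly as hard targets.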
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Focal Loss · Knowledge Distillation · Axial Attention
