Solving Label Variation in Scientific Information Extraction via Multi-Task Learning

Dong Pham; Xanh Ho; Quang-Thuy Ha; Akiko Aizawa

arXiv:2312.15751 · cs.CL · December 27, 2023 · 1 citation

TL;DR

This paper introduces a multi-task learning approach with soft labeling to address label variation issues in Scientific Information Extraction, improving robustness and performance despite limited annotated data.

Contribution

It proposes a novel multi-task learning framework with soft labeling to handle label conflicts and noise in ScientificIE datasets, enhancing model robustness and accuracy.

Findings

01 Improved model performance on ScientificIE tasks.

02 Enhanced robustness to label noise.

03 Potential reduction in data annotation requirements.

Abstract

Scientific Information Extraction (ScientificIE) is a critical task that involves the identification of scientific entities and their relationships. The complexity of this task is compounded by the necessity for domain-specific knowledge and the limited availability of annotated data. Two of the most popular datasets for ScientificIE are SemEval-2018 Task-7 and SciERC. They have overlapping samples and differ in their annotation schemes, which leads to conflicts. In this study, we first introduced a novel approach based on multi-task learning to address label variations. We then proposed a soft labeling technique that converts inconsistent labels into probabilistic distributions. The experimental results demonstrated that the proposed method can enhance the model robustness to label noise and improve the end-to-end performance in both ScientificIE tasks. The analysis revealed that label…
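
The soft labeling idea in the abstract (turning inconsistent labels into a probability distribution) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the shared label set, the relation names, and the optional `smoothing` parameter are all assumptions made for illustration, and the sketch presumes the two annotation schemes have already been mapped onto one common label set.

```python
import math
from collections import Counter


def soft_label(annotations, label_set, smoothing=0.0):
    """Convert the labels given to one sample into a probability distribution.

    annotations: labels assigned to the same sample by different sources
                 (hypothetical example: one label per dataset).
    label_set:   ordered list of all labels in the assumed shared scheme.
    smoothing:   optional mass spread uniformly over all labels (an assumption,
                 not something the abstract specifies).
    """
    if not annotations:
        raise ValueError("at least one annotation is required")
    counts = Counter(annotations)
    total = len(annotations)
    k = len(label_set)
    return [
        (1.0 - smoothing) * counts[label] / total + smoothing / k
        for label in label_set
    ]


def soft_cross_entropy(predicted_probs, target_probs, eps=1e-12):
    """Cross-entropy between a predicted distribution and a soft target."""
    return -sum(
        t * math.log(p + eps) for p, t in zip(predicted_probs, target_probs)
    )


# Illustrative relation names only; the real label sets come from the datasets.
LABELS = ["USED-FOR", "PART-OF", "COMPARE"]

# The same entity pair receives different labels from two sources.
target = soft_label(["USED-FOR", "PART-OF"], LABELS)

# A model prediction that splits its mass the same way as the soft target.
loss = soft_cross_entropy([0.5, 0.5, 0.0], target)
```

With a soft target, a prediction that hedges between the two conflicting labels is penalized less than it would be against a single hard label, which is one plausible reading of why soft labels could help with label noise.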

Code & Models

Repositories

dongpham120899/labelvariation_sciie
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Text and Document Classification Technologies