Solving Label Variation in Scientific Information Extraction via Multi-Task Learning
Dong Pham, Xanh Ho, Quang-Thuy Ha, Akiko Aizawa

TL;DR
This paper introduces a multi-task learning approach with soft labeling to address label variation issues in Scientific Information Extraction, improving robustness and performance despite limited annotated data.
Contribution
It proposes a novel multi-task learning framework with soft labeling to handle label conflicts and noise in ScientificIE datasets, enhancing model robustness and accuracy.
Findings
Improved model performance on ScientificIE tasks.
Enhanced robustness to label noise.
Potential reduction in data annotation requirements.
Abstract
Scientific Information Extraction (ScientificIE) is a critical task that involves the identification of scientific entities and their relationships. The complexity of this task is compounded by the necessity for domain-specific knowledge and the limited availability of annotated data. Two of the most popular datasets for ScientificIE are SemEval-2018 Task-7 and SciERC. They have overlapping samples and differ in their annotation schemes, which leads to conflicts. In this study, we first introduced a novel approach based on multi-task learning to address label variations. We then proposed a soft labeling technique that converts inconsistent labels into probabilistic distributions. The experimental results demonstrated that the proposed method can enhance the model robustness to label noise and improve the end-to-end performance in both ScientificIE tasks. The analysis revealed that label…
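The soft labeling idea mentioned in the abstract, converting inconsistent labels into probabilistic distributions, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the distributions are built, so the sketch assumes the simplest option (normalized counts over the conflicting annotations). The label set and the helper names `soft_label` and `soft_cross_entropy` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical label set, for illustration only.
LABELS = ["Task", "Method", "Material"]


def soft_label(annotations, label_set, smoothing=0.0):
    """Convert possibly conflicting labels for one sample into a distribution.

    Assumption: each annotation counts as one vote and the votes are
    normalized. Optional additive smoothing spreads some mass over all labels.
    """
    if not annotations and smoothing == 0.0:
        raise ValueError("need at least one annotation or smoothing > 0")
    counts = Counter(annotations)
    total = len(annotations) + smoothing * len(label_set)
    return [(counts[label] + smoothing) / total for label in label_set]


def soft_cross_entropy(predicted_probs, target_probs, eps=1e-12):
    """Cross-entropy between predicted probabilities and a soft target."""
    return -sum(t * math.log(p + eps)
                for p, t in zip(predicted_probs, target_probs))


# One dataset marks the span as "Task", the other as "Method".
target = soft_label(["Task", "Method"], LABELS)
print(target)  # [0.5, 0.5, 0.0]

# Loss for a hypothetical model prediction over the same label set.
loss = soft_cross_entropy([0.6, 0.3, 0.1], target)
print(loss)
```

With a soft target, a model that splits probability between the two conflicting labels is penalized less than it would be under a single hard label, which is one plausible reading of how soft labels could reduce sensitivity to label noise.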
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Text and Document Classification Technologies
