Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods

Xiang Dai; Sarvnaz Karimi

arXiv:2211.13819·cs.CL·November 28, 2022·1 cites

TL;DR

This paper compares word-based tagging and span-based classification methods for recognising named entities in astrophysics literature. Both were evaluated in the DEAL shared task, where the best submission reached F1 scores of 0.8307 (validation phase) and 0.7990 (testing phase).

Contribution

It provides an empirical comparison of word-based versus span-based entity recognition methods specifically applied to astrophysics literature.

Findings

01 Best F1 score of 0.8307 on the hidden test set of the validation phase

02 Best F1 score of 0.7990 on the hidden test set of the testing phase

03 Demonstrates effectiveness of span-based methods in scientific literature

Abstract

Information Extraction from scientific literature can be challenging due to the highly specialised nature of such text. We describe our entity recognition methods developed as part of the DEAL (Detecting Entities in the Astrophysics Literature) shared task. The aim of the task is to build a system that can identify Named Entities in a dataset composed of scholarly articles from the astrophysics literature. We planned our participation so that it enables an empirical comparison between word-based tagging and span-based classification methods. When evaluated on two hidden test sets provided by the organizer, our best-performing submission achieved $F_{1}$ scores of 0.8307 (validation phase) and 0.7990 (testing phase).
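To make the two formulations in the abstract concrete, the sketch below shows how the same annotations can be represented for word-based tagging (one BIO tag per token) and for span-based classification (one label per candidate span), along with an exact-match span-level F1. It is a minimal illustration and not the authors' implementation: the example sentence, the label names, the `max_width` limit, and all function names are assumptions, and the shared task's official scorer may compute F1 differently.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label); hypothetical format


def spans_to_bio(n_tokens: int, spans: List[Span]) -> List[str]:
    """Word-based view: assign one BIO tag to every token."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a BIO tag sequence back into labelled spans."""
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" flushes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues and label is not None:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def enumerate_spans(n_tokens: int, max_width: int) -> List[Tuple[int, int]]:
    """Span-based view: list every candidate span up to max_width tokens.
    A span classifier would then assign each candidate a label or 'no entity'."""
    return [
        (i, j)
        for i in range(n_tokens)
        for j in range(i + 1, min(i + max_width, n_tokens) + 1)
    ]


def span_f1(gold: List[Span], pred: List[Span]) -> float:
    """Exact-match F1: a prediction counts only if boundaries and label match."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_set)
    recall = tp / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Illustrative sentence and labels (assumed, not taken from the DEAL data).
tokens = ["We", "observed", "NGC", "1365", "with", "Hubble"]
gold = [(2, 4, "CelestialObject"), (5, 6, "Telescope")]
tags = spans_to_bio(len(tokens), gold)
candidates = enumerate_spans(len(tokens), max_width=3)
```

Here `tags` holds six token-level tags, while `candidates` holds every span of up to three tokens that a span classifier would have to score. The word-based view keeps the prediction count linear in sentence length, whereas the span-based view grows with `max_width` but scores each candidate entity as a whole.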

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques

Methods: Test