Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods
Xiang Dai, Sarvnaz Karimi

TL;DR
This paper compares word-based and span-based entity recognition methods for extracting named entities from astrophysics literature, demonstrating their effectiveness through empirical evaluation on a shared task dataset.
Contribution
It provides an empirical comparison of word-based versus span-based entity recognition methods specifically applied to astrophysics literature.
Findings
Best F1 score of 0.8307 on the hidden test set of the validation phase
Best F1 score of 0.7990 on the hidden test set of the testing phase
Demonstrates effectiveness of span-based methods in scientific literature
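To make the reported F1 figures concrete, the following is a minimal sketch of entity-level F1, assuming exact-match scoring in which a predicted entity counts as correct only if its boundaries and label both match a gold entity. The scoring definition actually used by the shared task organizers is not stated in this summary, so the function name `entity_f1`, the span representation, and the labels `A`, `B`, `C` are illustrative assumptions.

```python
from typing import Iterable, Tuple

# Hypothetical representation: (start_token, end_token_exclusive, label)
Span = Tuple[int, int, str]


def entity_f1(gold: Iterable[Span], pred: Iterable[Span]) -> float:
    """Exact-match entity-level F1 (an assumed scoring scheme)."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# One of two predictions matches exactly; the other has the wrong label.
gold = [(2, 4, "A"), (5, 6, "B")]
pred = [(2, 4, "A"), (5, 6, "C")]
score = entity_f1(gold, pred)
```

Under this assumption, a prediction with correct boundaries but the wrong label counts as both a false positive and a false negative.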
Abstract
Information Extraction from scientific literature can be challenging due to the highly specialised nature of such text. We describe our entity recognition methods developed as part of the DEAL (Detecting Entities in the Astrophysics Literature) shared task. The aim of the task is to build a system that can identify Named Entities in a dataset composed of scholarly articles from astrophysics literature. We planned our participation such that it enables us to conduct an empirical comparison between word-based tagging and span-based classification methods. When evaluated on two hidden test sets provided by the organizer, our best-performing submission achieved scores of 0.8307 (validation phase) and 0.7990 (testing phase).
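The difference between the two formulations compared in the abstract can be illustrated with a small sketch. Word-based tagging assigns one tag per token (a BIO scheme is assumed here), while span-based classification enumerates candidate token spans and assigns each a label or "no entity". The example sentence, the label names, the maximum span width, and the helper names `bio_to_spans` and `enumerate_spans` are assumptions for illustration. This is not the authors' implementation.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start_token, end_token_exclusive, label)


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Convert per-token BIO tags (word-based view) into labelled spans."""
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-") or tag.startswith("I-"):
            # A stray I- tag is treated as the start of a new entity.
            start, label = i, tag[2:]
    return spans


def enumerate_spans(n_tokens: int, max_width: int) -> List[Tuple[int, int]]:
    """List every candidate span of up to max_width tokens (span-based view)."""
    return [
        (s, e)
        for s in range(n_tokens)
        for e in range(s + 1, min(s + max_width, n_tokens) + 1)
    ]


# Hypothetical sentence and labels, for illustration only.
tokens = ["We", "observed", "NGC", "1068", "with", "Hubble"]
tags = ["O", "O", "B-CelestialObject", "I-CelestialObject", "O", "B-Telescope"]

gold_spans = bio_to_spans(tags)
candidates = enumerate_spans(len(tokens), max_width=2)

# Span-based training targets: each candidate is an entity type or "O".
gold_lookup = {(s, e): lab for s, e, lab in gold_spans}
span_targets = [(s, e, gold_lookup.get((s, e), "O")) for s, e in candidates]
```

In the word-based view a model predicts six tags for this sentence. In the span-based view it classifies eleven candidate spans, two of which are entities, so most candidates are negatives.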
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
