Embedding Models for Supervised Automatic Extraction and Classification of Named Entities in Scientific Acknowledgements
Nina Smirnova, Philipp Mayr

TL;DR
This paper evaluates embedding models for automatically extracting and classifying acknowledged entities in scientific papers. Accuracy was highest with a medium-sized training corpus, and the resulting model recognizes six entity types.
Contribution
The paper introduces a neural NER model trained on acknowledgment texts, shows how training corpus size affects accuracy, and provides a tool for automated acknowledgment analysis.
Findings
The Flair Embeddings model trained on the medium corpus achieved the best accuracy, 0.79
Expanding the training corpus from very small to medium size substantially improves the accuracy of all training algorithms
The model recognizes six entity types, with high precision for some, such as individuals and grant numbers
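To make the per-type precision finding concrete, here is a minimal sketch of how span-level precision can be computed for each entity type from BIO-tagged token sequences. It is not the authors' evaluation code: the tag names `IND` and `GRNB` and the toy sequences are hypothetical, chosen only to illustrate individuals and grant numbers, and the helper names are assumptions made for this example.

```python
from collections import defaultdict


def extract_spans(tags):
    """Collect (start, end, type) spans from a BIO tag sequence."""
    spans = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        # Close the open span unless the current tag continues it.
        if etype is not None and tag != f"I-{etype}":
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def precision_by_type(gold_tags, pred_tags):
    """Share of predicted spans per type that exactly match a gold span."""
    gold = extract_spans(gold_tags)
    pred = extract_spans(pred_tags)
    correct = defaultdict(int)
    total = defaultdict(int)
    for span in pred:
        total[span[2]] += 1
        if span in gold:
            correct[span[2]] += 1
    return {etype: correct[etype] / total[etype] for etype in total}


# Hypothetical toy data: the second IND prediction is off by one token.
gold = ["O", "B-IND", "I-IND", "O", "B-GRNB", "O", "B-IND"]
pred = ["O", "B-IND", "I-IND", "O", "B-GRNB", "B-IND", "O"]
print(precision_by_type(gold, pred))
```

A predicted span counts as correct only when its boundaries and type both match a gold span, so a single misplaced token lowers precision for that type without affecting the others.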
Abstract
Acknowledgments in scientific papers may give an insight into aspects of the scientific community, such as reward systems, collaboration patterns, and hidden research trends. The aim of the paper is to evaluate the performance of different embedding models for the task of automatic extraction and classification of acknowledged entities from the acknowledgment text in scientific papers. We trained and implemented a named entity recognition (NER) task using the Flair NLP framework. The training was conducted using three default Flair NER models with four differently-sized corpora and different versions of the Flair NLP framework. The Flair Embeddings model trained on the medium corpus with the latest Flair version showed the best accuracy of 0.79. Expanding the size of a training corpus from very small to medium size massively increased the accuracy of all training algorithms, but further…
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Advanced Text Analysis Techniques
