Machine Learning vs. Rules and Out-of-the-Box vs. Retrained: An Evaluation of Open-Source Bibliographic Reference and Citation Parsers
Dominika Tkaczyk, Andrew Collins, Paraic Sheridan, Joeran Beel

TL;DR
This study evaluates ten open-source bibliographic reference parsers in both their out-of-the-box and retrained versions, and shows the benefits of machine learning and of tuning to task-specific data.
Contribution
It compares ten reference parsers on the same business use case and demonstrates how machine learning and task-specific tuning affect parsing accuracy.
Findings
GROBID performs best out of the box (F1 0.89), followed by CERMINE (F1 0.83) and ParsCit (F1 0.75).
Machine learning tools have higher recall than rule-based tools.
Retraining on project-specific data improves performance over the out-of-the-box versions of the retrained tools.
Abstract
Bibliographic reference parsing refers to extracting machine-readable metadata, such as the names of the authors, the title, or journal name, from bibliographic reference strings. Many approaches to this problem have been proposed so far, including regular expressions, knowledge bases and supervised machine learning. Many open source reference parsers based on various algorithms are also available. In this paper, we apply, evaluate and compare ten reference parsing tools in a specific business use case. The tools are Anystyle-Parser, Biblio, CERMINE, Citation, Citation-Parser, GROBID, ParsCit, PDFSSA4MET, Reference Tagger and Science Parse, and we compare them in both their out-of-the-box versions and versions tuned to the project-specific data. According to our evaluation, the best performing out-of-the-box tool is GROBID (F1 0.89), followed by CERMINE (F1 0.83) and ParsCit (F1 0.75).…
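The F1 scores quoted above combine precision and recall over the extracted metadata fields. As a minimal sketch of how such a score can be computed, the Python below micro-averages over (field, value) pairs and counts a field as correct only on a case-insensitive exact match. The matching rule, the function name `field_level_scores`, and the example reference are assumptions for illustration; the summary does not state the paper's exact evaluation protocol.

```python
from typing import Dict, List, Tuple

# One parsed reference: field name -> extracted value (hypothetical schema).
Fields = Dict[str, str]


def field_level_scores(
    predicted: List[Fields], gold: List[Fields]
) -> Tuple[float, float, float]:
    """Return micro-averaged (precision, recall, F1) over metadata fields.

    Assumption: a predicted field is correct only if the gold reference has
    the same field name and the values match after trimming and lowercasing.
    """
    true_pos = 0
    n_pred = 0
    n_gold = 0
    for pred, ref in zip(predicted, gold):
        n_pred += len(pred)
        n_gold += len(ref)
        true_pos += sum(
            1
            for name, value in pred.items()
            if name in ref and ref[name].strip().lower() == value.strip().lower()
        )
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up example: the parser finds the author, gets the year wrong,
    # and misses the title entirely.
    gold = [{"author": "Tkaczyk", "title": "CERMINE", "year": "2015"}]
    predicted = [{"author": "Tkaczyk", "year": "2014"}]
    print(field_level_scores(predicted, gold))
```

Under this scoring, a tool that extracts few fields but gets them right keeps a high precision while its recall drops, which is one way a precision and recall gap between tools can show up in the F1 figure.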
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Data Quality and Management
