POS Tagging and its Applications for Mathematics
Ulf Schöneberg, Wolfram Sperber

TL;DR
This paper presents a mathematics-aware part-of-speech tagging method tailored for mathematical publications, demonstrating its application in key phrase extraction and classification within the zbMATH database.
Contribution
It introduces a novel adaptation of NLP techniques for mathematical texts, specifically handling mathematical formulae for improved content analysis.
Findings
A part-of-speech tagger that handles mathematical language, including formulae
Key phrase extraction for mathematical publications built on the tagger
Use of the same tools for classification in the zbMATH database
Abstract
Content analysis of scientific publications is a nontrivial task, but a useful and important one for scientific information services. In the Gutenberg era it was a domain of human experts; in the digital age many machine-based methods, e.g., graph analysis tools and machine-learning techniques, have been developed for it. Natural Language Processing (NLP) is a powerful machine-learning approach to semiautomatic speech and language processing, which is also applicable to mathematics. The well-established methods of NLP have to be adjusted for the special needs of mathematics, in particular for handling mathematical formulae. We demonstrate a mathematics-aware part-of-speech tagger and give a short overview of our adaptation of NLP methods for mathematical publications. We show the use of the tools developed for key phrase extraction and classification in the database zbMATH.
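To make the idea of formula-aware tagging concrete, here is a minimal Python sketch. It is not the authors' tagger: the toy lexicon, the `FORMULA` and `UNK` tags, the `$...$` formula delimiter, and the adjective/noun phrase pattern are all assumptions chosen for illustration. It only shows the general principle that a formula is kept as a single token with its own tag, so that it does not break up the surrounding text when key phrase candidates are collected.

```python
import re

# Hypothetical toy lexicon; a real tagger would use a trained model.
LEXICON = {
    "we": "PRP", "prove": "VB", "that": "IN", "every": "DT", "a": "DT",
    "compact": "JJ", "operator": "NN", "on": "IN", "has": "VBZ",
    "discrete": "JJ", "spectrum": "NN",
}

# An inline formula ($...$) is matched first so it stays one token;
# otherwise match a word or a single punctuation character.
TOKEN_RE = re.compile(r"\$[^$]+\$|\w+|[^\w\s]")


def tokenize(text):
    """Split text into tokens, keeping each $...$ formula intact."""
    return TOKEN_RE.findall(text)


def tag(tokens):
    """Attach a tag to each token; formulae get the assumed tag FORMULA."""
    tagged = []
    for tok in tokens:
        if len(tok) > 2 and tok.startswith("$") and tok.endswith("$"):
            tagged.append((tok, "FORMULA"))
        elif not tok[0].isalnum():
            tagged.append((tok, "PUNCT"))
        else:
            # Words missing from the toy lexicon are marked UNK.
            tagged.append((tok, LEXICON.get(tok.lower(), "UNK")))
    return tagged


def candidate_phrases(tagged):
    """Collect maximal runs of JJ/NN/FORMULA tokens that contain a noun."""
    phrases, current = [], []

    def flush():
        if any(t == "NN" for _, t in current):
            phrases.append(" ".join(tok for tok, _ in current))
        current.clear()

    for tok, t in tagged:
        if t in ("JJ", "NN", "FORMULA"):
            current.append((tok, t))
        else:
            flush()
    flush()
    return phrases


SENTENCE = r"We prove that every compact operator on $L^2(\mu)$ has a discrete spectrum."

if __name__ == "__main__":
    print(candidate_phrases(tag(tokenize(SENTENCE))))
    # prints ['compact operator', 'discrete spectrum']
```

In this sketch the formula `$L^2(\mu)$` survives tokenization as one unit instead of being split into symbols, which is the kind of adjustment the abstract refers to when it says standard NLP methods must be adapted for formulae.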
