ARQMath Lab: An Incubator for Semantic Formula Search in zbMATH Open?
Philipp Scharpf, Moritz Schubotz, Andre Greiner-Petter, Malte Ostendorff, Olaf Teschke, Bela Gipp

TL;DR
This paper discusses the development of the ARQMath Lab as an incubator for improving semantic formula search in zbMATH, exploring index structures, entity linking, and formula retrieval methods to better satisfy mathematical information needs.
Contribution
It uses the ARQMath evaluation to investigate formula retrieval techniques, including search based on MOI (mathematical objects of interest), with the aim of improving mathematical information retrieval.
Findings
Manual runs provided insights into user needs and answer types.
MOI search shows promise despite low competition scores.
The perceived quality of its results motivates further research into MOI.
Abstract
The zbMATH database contains more than 4 million bibliographic entries. We aim to provide easy access to these entries. Therefore, we maintain different index structures, including a formula index. To optimize the findability of the entries in our database, we continuously investigate new approaches to satisfy the information needs of our users. We believe that the findings from the ARQMath evaluation will generate new insights into which index structures are most suitable to satisfy mathematical information needs. Search engines, recommender systems, plagiarism checking software, and many other added-value services acting on databases such as the arXiv and zbMATH need to combine natural and formula language. One initial approach to address this challenge is to enrich the mostly unstructured document data via Entity Linking. The ARQMath Task at CLEF 2020 aims to tackle the problem of…
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Natural Language Processing Techniques · Advanced Database Systems and Queries
