Discovery and Recognition of Formula Concepts using Machine Learning
Philipp Scharpf, Moritz Schubotz, Howard S. Cohl, Corinna Breitinger, Bela Gipp

TL;DR
This paper introduces a machine learning framework for discovering and recognizing mathematical formula concepts, enabling semantic search and improved information retrieval in scientific documents.
Contribution
It defines the tasks of Formula Concept Discovery and Recognition and proposes ML-based methods evaluated on a standard dataset, advancing citation and retrieval of mathematical formulas.
Findings
FCD achieves 68% precision in retrieving equivalent formulas
FCD attains 72% recall in extracting formula names
Methods facilitate semantic search and plagiarism detection
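For readers less familiar with the two metrics reported above, the following is a minimal sketch of how precision and recall are computed for a retrieval result. The function name `precision_recall` and the formula strings are hypothetical illustrations, not the paper's code or data.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Precision: share of retrieved items that are actually relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant items that were actually retrieved.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: candidate representations of one formula concept.
retrieved = ["E = mc^2", "E = m c^2", "F = ma", "m = E/c^2"]
relevant = ["E = mc^2", "E = m c^2", "m = E/c^2", "E_0 = mc^2"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the four retrieved strings are relevant and three of the four relevant strings are retrieved, so both values are 0.75.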
Abstract
Citation-based Information Retrieval (IR) methods for scientific documents have proven effective for IR applications, such as Plagiarism Detection or Literature Recommender Systems in academic disciplines that use many references. In science, technology, engineering, and mathematics, researchers often employ mathematical concepts through formula notation to refer to prior knowledge. Our long-term goal is to generalize citation-based IR methods and apply this generalized method to both classical references and mathematical concepts. In this paper, we suggest how mathematical formulas could be cited and define a Formula Concept Retrieval task with two subtasks: Formula Concept Discovery (FCD) and Formula Concept Recognition (FCR). While FCD aims at the definition and exploration of a 'Formula Concept' that names bundled equivalent representations of a formula, FCR is designed to match a…
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Advanced Text Analysis Techniques
Methods: Test
