Mathematical Derivation Graphs: A Relation Extraction Task in STEM Manuscripts
Vishesh Prasad, Brian Kim, Nickvash Kani

TL;DR
This paper introduces the Mathematical Derivation Graphs Dataset for analyzing mathematical relationships in STEM texts and evaluates how well NLP and ML models extract derivation dependencies between equations; the models tested achieve only moderate success.
Contribution
It presents the first dataset and evaluation framework for extracting derivation relationships between equations in STEM articles, expanding NLP applications to mathematical content.
Findings
Best LLMs achieve 45-52% F1 score in relation extraction
Combining LLMs with analytical methods shows potential for improvement
The dataset enables future research in mathematical relation extraction
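To make the F1 figure above concrete, here is a minimal sketch of how an edge-level F1 score could be computed for a derivation graph. It assumes a derivation graph is stored as a set of directed (source equation, derived equation) pairs and that a predicted edge counts as correct only if it exactly matches a labeled edge. The function name edge_f1, the equation labels, and the exact-match criterion are illustrative assumptions, not the paper's stated evaluation protocol.

```python
def edge_f1(gold_edges, predicted_edges):
    """Return (precision, recall, F1) over directed derivation edges.

    Assumption: each edge is a (source_equation, derived_equation) tuple,
    and a prediction is correct only on an exact match with a gold edge.
    """
    gold = set(gold_edges)
    predicted = set(predicted_edges)

    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: equation labels are made up for illustration.
gold = {("eq1", "eq2"), ("eq1", "eq3"), ("eq2", "eq4")}
predicted = {("eq1", "eq2"), ("eq2", "eq4"), ("eq3", "eq4")}

precision, recall, f1 = edge_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case two of the three predicted edges match the labels, so precision, recall, and F1 all equal 2/3. Treating edges as a set makes the metric insensitive to the order in which a model emits them.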
Abstract
Recent advances in natural language processing (NLP), particularly with the emergence of large language models (LLMs), have significantly enhanced the field of textual analysis. However, while these developments have yielded substantial progress in analyzing natural language text, applying analysis to mathematical equations and their relationships within texts has produced mixed results. This paper takes the initial steps in expanding the problem of relation extraction towards understanding the dependency relationships between mathematical expressions in STEM articles. The authors construct the Mathematical Derivation Graphs Dataset (MDGD), sourced from a random sampling of the arXiv corpus, containing an analysis of published STEM manuscripts with over manually labeled inter-equation dependency relationships, resulting in a new object referred to as a derivation graph that…
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Intelligent Tutoring Systems and Adaptive Learning · Computational Physics and Python Applications
