Semantic Representations of Mathematical Expressions in a Continuous Vector Space
Neeraj Gangwar, Nickvash Kani

TL;DR
This paper represents mathematical expressions as vectors using the encoder of a sequence-to-sequence model trained on pairs of mathematically equivalent expressions. The method captures mathematical semantics better than a layout-based structural approach, and the authors release a corpus of equivalent expressions for future research.
Contribution
It presents a vector-space embedding approach for mathematical expressions, compares it with a structural approach based on visual layout, and releases a corpus of equivalent expressions.
Findings
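The encoder of a sequence-to-sequence model, trained on visually different but mathematically equivalent expressions, produces embeddings that capture mathematical semantics.
The proposed method captures mathematical semantics better than a structural approach that embeds expressions by their visual layout.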
A new corpus of equivalent mathematical expressions is published.
Abstract
Mathematical notation makes up a large portion of STEM literature, yet finding semantic representations for formulae remains a challenging problem. Because mathematical notation is precise, and its meaning changes significantly with small character shifts, the methods that work for natural text do not necessarily work well for mathematical expressions. This work describes an approach for representing mathematical expressions in a continuous vector space. We use the encoder of a sequence-to-sequence architecture, trained on visually different but mathematically equivalent expressions, to generate vector representations (or embeddings). We compare this approach with a structural approach that considers visual layout to embed an expression and show that our proposed approach is better at capturing mathematical semantics. Finally, to expedite future research, we publish a corpus of…
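The abstract's two central observations, that a small character change can alter meaning while visually different expressions can be equivalent, can be illustrated with a minimal sketch. This is not the authors' pipeline: the helper names (`evaluate`, `numerically_equivalent`), the Python-syntax expression strings, and the random-sampling check are all assumptions made for illustration, and the check is a numerical heuristic rather than a proof of equivalence.

```python
import math
import random

# Functions made available to expressions (hypothetical choice for this sketch).
# eval() is used only on trusted, hard-coded strings here.
SAFE_NAMES = {"sin": math.sin, "cos": math.cos, "exp": math.exp, "log": math.log}


def evaluate(expr: str, x: float) -> float:
    """Evaluate a Python-syntax expression in the single variable x."""
    return eval(expr, {"__builtins__": {}}, {**SAFE_NAMES, "x": x})


def numerically_equivalent(expr_a: str, expr_b: str,
                           trials: int = 50, seed: int = 0) -> bool:
    """Heuristic: do both expressions agree at random points in [0.5, 2.0]?"""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(0.5, 2.0)
        a, b = evaluate(expr_a, x), evaluate(expr_b, x)
        if not math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9):
            return False
    return True


candidate_pairs = [
    ("sin(x)**2 + cos(x)**2", "1"),     # visually different, equivalent
    ("(x + 1)**2", "x**2 + 2*x + 1"),   # visually different, equivalent
    ("x**2", "x*2"),                    # one character apart, not equivalent
]
for a, b in candidate_pairs:
    print(a, "|", b, "->", numerically_equivalent(a, b))
```

Pairs that pass a check like this are the kind of "visually different but mathematically equivalent" examples the abstract describes as training data. A layout-based embedding would tend to place `x**2` and `x*2` close together, which is the failure mode the semantic approach aims to avoid.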
Taxonomy
Topics: Data Visualization and Analytics · Mathematics, Computing, and Information Processing · Mathematics Education and Teaching Techniques
