PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer
Tongkun Guan, Chengyu Lin, Wei Shen, and Xiaokang Yang

TL;DR
PosFormer introduces a position forest transformer that explicitly models symbol positions and hierarchies in handwritten mathematical expressions, significantly improving recognition accuracy without extra computational cost.
Contribution
The paper presents a novel position forest transformer that jointly optimizes expression and position recognition, explicitly capturing structural relationships in handwritten math expressions.
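As a rough illustration of what "jointly optimizes expression and position recognition" can look like, the sketch below sums two softmax cross-entropy terms, one over symbol predictions and one over position predictions. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `position_weight` parameter, and the toy label spaces are all hypothetical and do not come from the paper.

```python
import math


def cross_entropy(logits, targets):
    """Mean softmax cross-entropy over decoding steps.

    logits: list of per-step score lists; targets: list of class indices.
    """
    total = 0.0
    for row, target in zip(logits, targets):
        peak = max(row)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_norm - row[target]
    return total / len(targets)


def joint_loss(symbol_logits, symbol_targets,
               position_logits, position_targets,
               position_weight=1.0):
    """Sum of an expression-recognition term and a position-recognition term.

    position_weight is a hypothetical balancing factor (an assumption of
    this sketch), not a value taken from the paper.
    """
    expression_term = cross_entropy(symbol_logits, symbol_targets)
    position_term = cross_entropy(position_logits, position_targets)
    return expression_term + position_weight * position_term


# Toy example: two decoding steps, three symbol classes, two position classes.
symbol_logits = [[2.0, 0.1, -1.0], [0.3, 1.5, 0.2]]
symbol_targets = [0, 1]
position_logits = [[0.5, -0.5], [-1.0, 1.0]]
position_targets = [0, 1]
print(joint_loss(symbol_logits, symbol_targets,
                 position_logits, position_targets))
```

Because both terms are computed from the same decoding steps, gradients from the position term can shape the symbol features during training; how the paper actually defines its position labels is not covered by this sketch.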
Findings
Outperforms state-of-the-art methods on single-line (CROHME), multi-line (M2E), and complex (MNE) datasets
Achieves over 2% accuracy improvement in recognition tasks
No additional latency or computational cost incurred
Abstract
Handwritten Mathematical Expression Recognition (HMER) has wide applications in human-machine interaction scenarios, such as digitized education and automated offices. Recently, sequence-based models with encoder-decoder architectures have been commonly adopted to address this task by directly predicting LaTeX sequences of expression images. However, these methods only implicitly learn the syntax rules provided by LaTeX, which may fail to describe the position and hierarchical relationship between symbols due to complex structural relations and diverse handwriting styles. To overcome this challenge, we propose a position forest transformer (PosFormer) for HMER, which jointly optimizes two tasks: expression recognition and position recognition, to explicitly enable position-aware symbol feature representation learning. Specifically, we first design a position forest that models the…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Processing and 3D Reconstruction · Hand Gesture Recognition Systems
Methods: Softmax · Attention Is All You Need
