Texo: Formula Recognition within 20M Parameters
Sicheng Mao

TL;DR
Texo is a lightweight formula recognition model with only 20 million parameters. It performs comparably to state-of-the-art models while enabling real-time inference on consumer-grade hardware and in-browser deployment.
Contribution
The paper introduces Texo, a minimalist formula recognition model that significantly reduces model size while maintaining high accuracy through attentive design, distillation, and transfer of the vocabulary and the tokenizer.
Findings
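Achieves performance comparable to state-of-the-art models such as UniMERNet-T and PPFormulaNet-S.
Reduces model size by 80% relative to UniMERNet-T and by 65% relative to PPFormulaNet-S.
Enables real-time inference on consumer-grade hardware and in-browser deployment.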
Abstract
In this paper we present Texo, a minimalist yet high-performance formula recognition model that contains only 20 million parameters. By attentive design, distillation and transfer of the vocabulary and the tokenizer, Texo achieves comparable performance to state-of-the-art models such as UniMERNet-T and PPFormulaNet-S, while reducing the model size by 80% and 65%, respectively. This enables real-time inference on consumer-grade hardware and even in-browser deployment. We also developed a web application to demonstrate the model capabilities and facilitate its usage for end users.
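The size figures in the abstract can be turned into rough numbers with a little arithmetic. The sketch below assumes that "model size" means parameter count and that weights are stored densely at a fixed width; neither assumption is stated on this page. The baseline sizes it prints are therefore values implied by the 80% and 65% reductions, not reported measurements, and the function names are illustrative only.

```python
def implied_baseline_params(small_params: float, reduction: float) -> float:
    """Baseline parameter count implied by a fractional size reduction.

    If the small model is `reduction` smaller than the baseline, then
    small = baseline * (1 - reduction), so baseline = small / (1 - reduction).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return small_params / (1.0 - reduction)


def weight_megabytes(params: float, bytes_per_param: int) -> float:
    """Approximate dense weight storage in MB (1 MB = 1e6 bytes)."""
    return params * bytes_per_param / 1e6


TEXO_PARAMS = 20e6  # 20 million parameters, as stated in the abstract

# Assumption: the reductions are measured in parameter count.
for name, reduction in [("UniMERNet-T", 0.80), ("PPFormulaNet-S", 0.65)]:
    baseline = implied_baseline_params(TEXO_PARAMS, reduction)
    print(f"{name}: roughly {baseline / 1e6:.0f}M parameters implied")

# Assumption: 4 bytes per weight (float32) or 2 bytes per weight (float16).
for label, width in [("float32", 4), ("float16", 2)]:
    print(f"Texo weights at {label}: about {weight_megabytes(TEXO_PARAMS, width):.0f} MB")
```

Under these assumptions the weights fit in tens of megabytes, which is consistent with the claim that the model is small enough to download and run in a browser.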
Taxonomy
Topics: Handwritten Text Recognition Techniques · Mathematics, Computing, and Information Processing · Image and Object Detection Techniques
