Texo: Formula Recognition within 20M Parameters

Sicheng Mao

arXiv:2602.17189·cs.AI·February 20, 2026

TL;DR

Texo is a lightweight formula recognition model with only 20 million parameters. It performs comparably to state-of-the-art models such as UniMERNet-T and PPFormulaNet-S while enabling real-time inference on consumer-grade hardware and in-browser deployment.

Contribution

The paper introduces Texo, a minimalist formula recognition model that substantially reduces model size while maintaining accuracy comparable to larger models, through attentive design, distillation and transfer of the vocabulary and the tokenizer.

Findings

01

Achieves performance comparable to state-of-the-art models such as UniMERNet-T and PPFormulaNet-S.

02

Reduces model size by 80% relative to UniMERNet-T and by 65% relative to PPFormulaNet-S.

03

Enables real-time inference on consumer-grade hardware and in-browser deployment.

Abstract

In this paper, we present Texo, a minimalist yet high-performance formula recognition model that contains only 20 million parameters. By attentive design, distillation and transfer of the vocabulary and the tokenizer, Texo achieves comparable performance to state-of-the-art models such as UniMERNet-T and PPFormulaNet-S, while reducing the model size by 80% and 65%, respectively. This enables real-time inference on consumer-grade hardware and even in-browser deployment. We also developed a web application to demonstrate the model's capabilities and facilitate its use by end users.
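The size figures in the abstract can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it back-computes the baseline parameter counts implied by the stated 80% and 65% reductions, and estimates the raw weight footprint of a 20M-parameter model at common numeric precisions. The function names and the bytes-per-parameter values are assumptions made for this example, not details from the paper, and the results are approximations since the abstract's percentages are rounded.

```python
# Rough arithmetic on the abstract's size claims.
# All helper names here are hypothetical; none come from the Texo paper or code.

TEXO_PARAMS = 20e6  # "only 20 million parameters" (from the abstract)


def implied_baseline_params(model_params: float, reduction: float) -> float:
    """Baseline size implied by a fractional size reduction.

    If model = baseline * (1 - reduction), then
    baseline = model / (1 - reduction).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return model_params / (1.0 - reduction)


def weight_size_mb(params: float, bytes_per_param: int) -> float:
    """Raw weight storage in megabytes (10^6 bytes), ignoring file overhead."""
    return params * bytes_per_param / 1e6


if __name__ == "__main__":
    # Reductions stated in the abstract, in the order the models are named.
    for name, reduction in [("UniMERNet-T", 0.80), ("PPFormulaNet-S", 0.65)]:
        baseline = implied_baseline_params(TEXO_PARAMS, reduction)
        print(f"{name}: implied size of about {baseline / 1e6:.1f}M parameters")

    # Assumed storage precisions: 4 bytes (float32), 2 (float16), 1 (int8).
    for label, nbytes in [("float32", 4), ("float16", 2), ("int8", 1)]:
        size = weight_size_mb(TEXO_PARAMS, nbytes)
        print(f"{label}: about {size:.0f} MB of weights")
```

Under these assumptions, the weights alone come to tens of megabytes, which is consistent with the abstract's claim that the model is small enough to download and run in a browser.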

Taxonomy

Topics: Handwritten Text Recognition Techniques · Mathematics, Computing, and Information Processing · Image and Object Detection Techniques