MixTex: Unambiguous Recognition Should Not Rely Solely on Real Data

Renqing Luo; Yuhan Xu

arXiv:2406.17148·cs.CV·July 12, 2024

TL;DR

MixTex is a LaTeX OCR model that uses Transformer architectures and a novel data augmentation technique to reduce bias and improve recognition accuracy in multilingual and ambiguous scenarios.

Contribution

The paper introduces MixTex, combining Swin Transformer and RoBERTa, with a new data augmentation method to mitigate bias in LaTeX OCR tasks.

Findings

01 Significantly reduces bias in recognition tasks.
02 Adheres strictly to image content in clear, unambiguous cases.
03 Applicable to other disambiguation recognition tasks.

Abstract

This paper introduces MixTex, an end-to-end LaTeX OCR model designed for low-bias multilingual recognition, along with its novel data collection method. In applying Transformer architectures to LaTeX text recognition, we identified specific bias issues, such as the frequent misinterpretation of $e - t$ as $e^{-t}$. We attribute this bias to the characteristics of the arXiv dataset commonly used for training. To mitigate this bias, we propose an innovative data augmentation method. This approach introduces controlled noise into the recognition targets by blending genuine text with pseudo-text and incorporating a small proportion of disruptive characters. We further suggest that this method has broader applicability to various disambiguation recognition tasks, including the accurate identification of erroneous notes in musical performances. MixTex's architecture leverages the Swin…
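
The augmentation described in the abstract (blending genuine text with pseudo-text and adding a small proportion of disruptive characters) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `pseudo_ratio` and `noise_ratio` parameters, their default values, and the `NOISE_CHARS` set are all hypothetical choices made for illustration, and the sketch assumes balanced inline `$...$` math only.

```python
import random
import string

NOISE_CHARS = "#@&%~^"  # hypothetical set of disruptive characters


def pseudo_word(word, rng):
    """Swap every ASCII letter for a random letter of the same case."""
    out = []
    for ch in word:
        if ch in string.ascii_lowercase:
            out.append(rng.choice(string.ascii_lowercase))
        elif ch in string.ascii_uppercase:
            out.append(rng.choice(string.ascii_uppercase))
        else:
            out.append(ch)  # digits and punctuation stay as they are
    return "".join(out)


def perturb_word(word, rng, pseudo_ratio, noise_ratio):
    """Maybe turn a word into pseudo-text, then maybe insert one noise character."""
    # Leave empty tokens and LaTeX commands such as \alpha untouched.
    if not word or word.startswith("\\"):
        return word
    if rng.random() < pseudo_ratio:
        word = pseudo_word(word, rng)
    if rng.random() < noise_ratio:
        pos = rng.randrange(len(word) + 1)
        word = word[:pos] + rng.choice(NOISE_CHARS) + word[pos:]
    return word


def mix_text(source, pseudo_ratio=0.3, noise_ratio=0.02, seed=0):
    """Blend genuine text with pseudo-text outside inline math.

    Assumes balanced inline $...$ delimiters (no $$ display math), so the
    even-indexed pieces of source.split("$") are plain text and the
    odd-indexed pieces are math, which is copied through unchanged.
    """
    rng = random.Random(seed)
    segments = source.split("$")
    for i in range(0, len(segments), 2):
        words = segments[i].split(" ")
        segments[i] = " ".join(
            perturb_word(w, rng, pseudo_ratio, noise_ratio) for w in words
        )
    return "$".join(segments)


if __name__ == "__main__":
    print(mix_text("The value $e - t$ is small", pseudo_ratio=0.5, noise_ratio=0.1))
```

In a pipeline of this kind, the perturbed string would presumably be rendered to an image and paired with itself as the recognition target, so that a model cannot rely on language priors and has to read what is actually on the page. The rendering step is outside the scope of this sketch.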

Code & Models

Repositories

RQLuo/MixTeX · jax · Official

Taxonomy

Topics: Interpreting and Communication in Healthcare · Second Language Learning and Teaching

Methods: Weight Decay · WordPiece · Linear Warmup With Linear Decay · Stochastic Depth · Attention Dropout · BERT · Softmax · RoBERTa · Layer Normalization