MixTex: Unambiguous Recognition Should Not Rely Solely on Real Data
Renqing Luo, Yuhan Xu

TL;DR
MixTex is a LaTeX OCR model that uses Transformer architectures and a novel data augmentation technique to reduce bias and improve recognition accuracy in multilingual and ambiguous scenarios.
Contribution
The paper introduces MixTex, combining Swin Transformer and RoBERTa, with a new data augmentation method to mitigate bias in LaTeX OCR tasks.
Findings
Significantly reduces bias in recognition tasks.
Adheres strictly to image content in clear, unambiguous cases.
Applicable to other disambiguation recognition tasks.
Abstract
This paper introduces MixTex, an end-to-end LaTeX OCR model designed for low-bias multilingual recognition, along with its novel data collection method. In applying Transformer architectures to LaTeX text recognition, we identified specific bias issues, such as the frequent misinterpretation of $e-t$ as $e^{-t}$. We attribute this bias to the characteristics of the arXiv dataset commonly used for training. To mitigate this bias, we propose an innovative data augmentation method. This approach introduces controlled noise into the recognition targets by blending genuine text with pseudo-text and incorporating a small proportion of disruptive characters. We further suggest that this method has broader applicability to various disambiguation recognition tasks, including the accurate identification of erroneous notes in musical performances. MixTex's architecture leverages the Swin…
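The augmentation idea in the abstract (blending genuine text with pseudo-text and adding a small proportion of disruptive characters) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the `DISRUPTIVE_CHARS` pool, and the `pseudo_ratio` and `disrupt_ratio` values are all hypothetical, chosen only to show the mechanism. The assumption is that the mixed text would be both rendered to an image and used as the recognition target, so the model cannot score well by guessing likely words from context.

```python
import random
import string

# Hypothetical pool of characters treated as "disruptive" noise (an assumption).
DISRUPTIVE_CHARS = "#@~^|"


def make_pseudo_word(rng, min_len=3, max_len=8):
    """Return a random lowercase string (almost always a non-word)."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def mix_text(words, pseudo_ratio=0.3, disrupt_ratio=0.02, seed=None):
    """Blend genuine words with pseudo-words and inject rare disruptive characters.

    Both ratios are illustrative defaults, not values taken from the paper.
    """
    rng = random.Random(seed)
    mixed = []
    for word in words:
        # With probability pseudo_ratio, swap the genuine word for a pseudo-word.
        if rng.random() < pseudo_ratio:
            word = make_pseudo_word(rng)
        # With probability disrupt_ratio, insert one disruptive character
        # at a random position inside the word.
        if rng.random() < disrupt_ratio:
            pos = rng.randint(0, len(word))
            word = word[:pos] + rng.choice(DISRUPTIVE_CHARS) + word[pos:]
        mixed.append(word)
    return mixed


if __name__ == "__main__":
    sentence = "the solution decays as time grows".split()
    print(" ".join(mix_text(sentence, seed=0)))
```

Because the target string contains sequences that a language prior cannot predict, a model trained on such pairs is pushed to read the image content itself, which is the behavior the abstract describes for unambiguous cases.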
Taxonomy
Topics: Interpreting and Communication in Healthcare · Second Language Learning and Teaching
Methods: Weight Decay · WordPiece · Linear Warmup With Linear Decay · Stochastic Depth · Attention Dropout · BERT · Softmax · RoBERTa · Layer Normalization
