HTR-ConvText: Leveraging Convolution and Textual Information for Handwritten Text Recognition
Pham Thach Thanh Truc, Dang Hoai Nam, Huynh Tong Dang Khoa, Vo Nguyen Le Duy

TL;DR
HTR-ConvText is a novel model that combines convolutional and textual features to improve handwritten text recognition, especially with limited data and diverse handwriting styles.
Contribution
The paper introduces HTR-ConvText, a hybrid architecture integrating CNN, MobileViT, and hierarchical encoding to enhance feature extraction and generalization in handwritten text recognition.
Findings
Achieves better accuracy on multiple datasets.
Improves generalization with limited training data.
Outperforms existing methods in complex handwriting scenarios.
Abstract
Handwritten Text Recognition remains challenging due to limited data, high variance in writing styles, and scripts with complex diacritics. Although existing approaches partially address these issues, they often struggle to generalize without massive synthetic data. To address these challenges, we propose HTR-ConvText, a model designed to capture fine-grained, stroke-level local features while preserving global contextual dependencies. In the feature extraction stage, we integrate a residual Convolutional Neural Network backbone with a MobileViT with Positional Encoding block. This enables the model to both capture structural patterns and learn subtle writing details. We then introduce the ConvText encoder, a hybrid architecture combining global context and local features within a hierarchical structure that reduces sequence length for improved efficiency. Additionally, an auxiliary module…
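
The abstract says the ConvText encoder uses a hierarchical structure that reduces sequence length for efficiency, but the excerpt here does not say how. The sketch below illustrates only the general idea, under the assumption that adjacent feature frames are merged by average pooling with stride 2. The function names `downsample` and `attention_pairs` are hypothetical and not taken from the paper.

```python
from typing import List

Frame = List[float]  # one feature vector per horizontal position in the text line


def downsample(frames: List[Frame], stride: int = 2) -> List[Frame]:
    """Average each run of `stride` adjacent frames into a single frame.

    ASSUMPTION: average pooling is a stand-in; the paper's actual
    reduction operator is not described in the abstract.
    The final group may hold fewer than `stride` frames.
    """
    pooled: List[Frame] = []
    for start in range(0, len(frames), stride):
        group = frames[start:start + stride]
        dim = len(group[0])
        pooled.append([sum(f[d] for f in group) / len(group) for d in range(dim)])
    return pooled


def attention_pairs(seq_len: int) -> int:
    """Query-key pairs scored by full self-attention over `seq_len` frames."""
    return seq_len * seq_len


# Toy input: 8 frames with 2 features each.
frames = [[float(i), 1.0] for i in range(8)]
shorter = downsample(frames)

print(len(frames), attention_pairs(len(frames)))
print(len(shorter), attention_pairs(len(shorter)))
```

Halving the sequence length cuts the number of pairwise attention scores to a quarter, which is the usual motivation for shortening the sequence in deeper stages of a hierarchical encoder.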
Taxonomy
Topics: Handwritten Text Recognition Techniques · Topic Modeling · Advanced Neural Network Applications
