HTR-JAND: Handwritten Text Recognition with Joint Attention Network and Knowledge Distillation
Mohammed Hamdan, Abderrahmane Rahiche, Mohamed Cheriet

TL;DR
HTR-JAND is an advanced handwritten text recognition framework that combines joint attention mechanisms and knowledge distillation to improve accuracy and efficiency on historical documents, achieving state-of-the-art results.
Contribution
The paper introduces a novel HTR architecture integrating multi-stage training, advanced attention, and knowledge distillation for improved performance and model compression.
Findings
Achieves state-of-the-art character error rate (CER) on multiple datasets.
Reduces model size by 48% without performance loss.
Effective cross-dataset knowledge transfer.
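The findings above are stated in terms of CER, the character error rate. As a minimal sketch of how that metric is commonly computed (total character-level edit distance divided by total reference length), the following may help. The function names `edit_distance` and `cer` are illustrative assumptions, and the paper's own evaluation script may normalize text differently.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: summed edit distance over summed reference length."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars
```

Lower is better: a CER of 0.0 means every predicted line matches its reference exactly.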
Abstract
Despite significant advances in deep learning, current Handwritten Text Recognition (HTR) systems struggle with the inherent complexity of historical documents, including diverse writing styles, degraded text quality, and computational efficiency requirements across multiple languages and time periods. This paper introduces HTR-JAND (Handwritten Text Recognition with Joint Attention Network and Knowledge Distillation), an efficient HTR framework that combines advanced feature extraction with knowledge distillation. Our architecture incorporates three key components: (1) a CNN architecture integrating FullGatedConv2d layers with Squeeze-and-Excitation blocks for adaptive feature extraction, (2) a Combined Attention mechanism fusing Multi-Head Self-Attention with Proxima Attention for robust sequence modeling, and (3) a Knowledge Distillation framework enabling efficient model…
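To make the knowledge distillation component more concrete, here is a minimal sketch of the generic temperature-scaled distillation objective for a single prediction, in which a student is trained to match a teacher's softened output distribution alongside the ground-truth label. This is an assumption-laden illustration, not the paper's loss: the abstract is truncated before it gives details, and the names `distillation_loss`, `temperature`, and `alpha` and their default values are hypothetical. An HTR system would typically apply such a term per time step next to a sequence loss.

```python
import math
from typing import Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> list:
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits: Sequence[float],
                      teacher_logits: Sequence[float],
                      target: int,
                      temperature: float = 2.0,   # hypothetical default
                      alpha: float = 0.5) -> float:  # hypothetical default
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy against the ground-truth class at temperature 1.
    ce = -math.log(softmax(student_logits)[target])
    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce
```

With `alpha=1.0` the student learns only from the teacher, and with `alpha=0.0` the loss reduces to ordinary cross-entropy; intermediate values trade off the two signals.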
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Processing and 3D Reconstruction · Natural Language Processing Techniques
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Byte Pair Encoding · Inverse Square Root Schedule · Residual Connection · Knowledge Distillation · Multi-Head Attention · Layer Normalization
