AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks
Dmitrijs Kass, Ekta Vats

TL;DR
This paper introduces an attention-based encoder-decoder model for handwritten text recognition that leverages transfer learning from scene text models to improve data efficiency, validated on multiple datasets.
Contribution
It presents a novel end-to-end HTR system using transfer learning from scene text models, combining ResNet and bidirectional LSTM with attention mechanisms.
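The transfer-learning step can be pictured as initializing the handwriting model from a scene-text checkpoint. The following is a minimal sketch under stated assumptions, not the authors' code: parameters are modeled as plain nested lists, the names `encoder.w` and `decoder.out` are hypothetical, and the rule "copy a tensor only when its name and shape match" is an illustrative choice that the summary above does not confirm.

```python
def shape(x):
    """Return the dimensions of a nested list, e.g. [[0, 0], [0, 0]] -> (2, 2)."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0] if x else None
    return tuple(dims)


def init_from_pretrained(model_params, pretrained_params):
    """Start from scene-text weights wherever name and shape agree.

    Returns the merged parameter dict and the names that kept their
    fresh initialization (assumption: these layers are trained from
    scratch during fine-tuning on handwriting data).
    """
    merged, skipped = {}, []
    for name, value in model_params.items():
        source = pretrained_params.get(name)
        if source is not None and shape(source) == shape(value):
            merged[name] = source          # reuse pretrained weights
        else:
            merged[name] = value           # keep fresh initialization
            skipped.append(name)
    return merged, skipped


# Hypothetical example: the output layer differs because the two
# character sets have different sizes (3 vs. 5 classes here).
model = {
    "encoder.w": [[0.0, 0.0], [0.0, 0.0]],
    "decoder.out": [[0.0] * 3 for _ in range(2)],
}
scene_text = {
    "encoder.w": [[1.0, 2.0], [3.0, 4.0]],
    "decoder.out": [[9.0] * 5 for _ in range(2)],
}
merged, skipped = init_from_pretrained(model, scene_text)
```

In this toy case the encoder weights are inherited and only the mismatched output layer is left to be learned from the handwriting data.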
Findings
Effective on the multi-writer Imgur5K dataset and on the IAM dataset
Improved recognition accuracy over baseline models
Provides open-source code and pre-trained models
Abstract
This work proposes an attention-based sequence-to-sequence model for handwritten word recognition and explores transfer learning for data-efficient training of HTR systems. To overcome training data scarcity, this work leverages models pre-trained on scene text images as a starting point towards tailoring the handwriting recognition models. ResNet feature extraction and bidirectional LSTM-based sequence modeling stages together form an encoder. The prediction stage consists of a decoder and a content-based attention mechanism. The effectiveness of the proposed end-to-end HTR system has been empirically evaluated on a novel multi-writer dataset Imgur5K and the IAM dataset. The experimental results evaluate the performance of the HTR framework, further supported by an in-depth analysis of the error cases. Source code and pre-trained models are available at…
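To make the prediction stage concrete, here is a minimal sketch of one decoding step of content-based attention over the encoder outputs. It assumes an additive scoring function, score_i = v · tanh(W_h h_i + W_s s); the abstract does not state the exact scoring function, so that choice, along with the names `W_h`, `W_s`, and `v`, is an assumption made for illustration.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def additive_attention(encoder_states, decoder_state, W_h, W_s, v):
    """Return (context, weights) for a single decoding step.

    encoder_states: one feature vector per horizontal position of the
                    word image (the output of the sequence model).
    decoder_state:  the decoder's current hidden state.
    """
    projected_state = matvec(W_s, decoder_state)
    scores = []
    for h in encoder_states:
        projected_h = matvec(W_h, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_h, projected_state)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Numerically stable softmax over positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return context, weights


identity = [[1.0, 0.0], [0.0, 1.0]]
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context, weights = additive_attention(
    states, [0.5, -0.5], identity, identity, [1.0, 1.0]
)
```

At each step the decoder would combine `context` with its own state to predict the next character, then repeat with the updated state until an end-of-sequence symbol is produced.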
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Topic Modeling
Methods: Average Pooling · 1x1 Convolution · Global Average Pooling · Residual Block · Batch Normalization · Residual Connection · Bottleneck Residual Block · Convolution · Max Pooling
