Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition
Lei Kang, Pau Riba, Marçal Rusiñol, Alicia Fornés, Mauricio Villegas

TL;DR
This paper introduces a non-recurrent transformer-based model for handwritten text-line recognition, achieving high accuracy without the limitations of sequential processing and enabling recognition of out-of-vocabulary words.
Contribution
The paper presents a novel transformer-based approach that replaces recurrent neural networks for handwritten text recognition, allowing parallel processing and out-of-vocabulary word recognition.
Findings
Significant accuracy improvements over prior methods
Effective recognition in few-shot learning scenarios
Ability to recognize out-of-vocabulary words
Abstract
The advent of recurrent neural networks for handwriting recognition marked an important milestone reaching impressive recognition accuracies despite the great variability that we observe across different writing styles. Sequential architectures are a perfect fit to model text lines, not only because of the inherent temporal aspect of text, but also to learn probability distributions over sequences of characters and words. However, using such recurrent paradigms comes at a cost at training stage, since their sequential pipelines prevent parallelization. In this work, we introduce a non-recurrent approach to recognize handwritten text by the use of transformer models. We propose a novel method that bypasses any recurrence. By using multi-head self-attention layers both at the visual and textual stages, we are able to tackle character recognition as well as to learn language-related…
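The abstract's point about parallelization can be illustrated with a minimal sketch of multi-head self-attention. This is not the authors' implementation: the helper names (`self_attention`, `multi_head_self_attention`), the toy input, and the identity projection matrices are assumptions made for illustration, and the final output projection and positional encodings of a full transformer layer are omitted. It shows only that every position attends to every other position in one set of matrix products, with no step-by-step recurrence.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x has shape (sequence_length, d_model). All positions are handled in the
    same matrix products, so nothing depends on a previous time step.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


def multi_head_self_attention(x, heads):
    """Run several heads independently and concatenate their outputs.

    `heads` is a list of (w_q, w_k, w_v) tuples (hypothetical weights).
    """
    outputs = [self_attention(x, *head)[0] for head in heads]
    return [sum((out[t] for out in outputs), []) for t in range(len(x))]


# Toy input: 3 positions with 2 features each (illustrative values only).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

out, weights = self_attention(x, identity, identity, identity)
multi = multi_head_self_attention(x, [(identity, identity, identity)] * 2)
print(len(multi), len(multi[0]))
```

Each row of `weights` is a probability distribution over all input positions, which is what lets such a layer relate distant parts of a text line without processing it sequentially.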
Taxonomy
Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Topic Modeling
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Multi-Head Attention · Adam · Dropout · Dense Connections
