Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition

Lei Kang; Pau Riba; Marçal Rusiñol; Alicia Fornés; Mauricio Villegas

arXiv:2005.13044·cs.CV·May 28, 2020·20 cites

Open Access

TL;DR

This paper introduces a non-recurrent transformer-based model for handwritten text-line recognition, achieving high accuracy without the limitations of sequential processing and enabling recognition of out-of-vocabulary words.

Contribution

The paper presents a novel transformer-based approach that replaces recurrent neural networks for handwritten text recognition, allowing parallel processing and out-of-vocabulary word recognition.

Findings

01. Significant accuracy improvements over prior methods

02. Effective recognition in few-shot learning scenarios

03. Ability to recognize out-of-vocabulary words

Abstract

The advent of recurrent neural networks for handwriting recognition marked an important milestone reaching impressive recognition accuracies despite the great variability that we observe across different writing styles. Sequential architectures are a perfect fit to model text lines, not only because of the inherent temporal aspect of text, but also to learn probability distributions over sequences of characters and words. However, using such recurrent paradigms comes at a cost at training stage, since their sequential pipelines prevent parallelization. In this work, we introduce a non-recurrent approach to recognize handwritten text by the use of transformer models. We propose a novel method that bypasses any recurrence. By using multi-head self-attention layers both at the visual and textual stages, we are able to tackle character recognition as well as to learn language-related…
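To illustrate why attention layers avoid the sequential bottleneck the abstract mentions, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is not the authors' implementation: it uses a single head and identity query/key/value projections (both simplifying assumptions), and the function names `softmax` and `self_attention` are hypothetical. The point is only that every output position is computed from the whole input at once, with no dependence on a previous time step's output.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections); a real layer would learn these projections
    and run several heads side by side.

    x: list of T feature vectors, each of length d.
    Returns (outputs, weights), with T output vectors and a T x T weight matrix.
    """
    d = len(x[0])
    scale = math.sqrt(d)
    # Similarity of every position with every other position.
    # No row depends on the result of another row, so rows can run in parallel.
    scores = [
        [sum(q * k for q, k in zip(xi, xj)) / scale for xj in x]
        for xi in x
    ]
    weights = [softmax(row) for row in scores]
    # Each output is a weighted average of all input vectors.
    outputs = [
        [sum(w * xj[c] for w, xj in zip(wrow, x)) for c in range(d)]
        for wrow in weights
    ]
    return outputs, weights


if __name__ == "__main__":
    # Three toy "frames" with two features each (made-up values).
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    outputs, weights = self_attention(features)
    for row in weights:
        print([round(w, 3) for w in row])
```

In a recurrent model the state at step t must wait for step t-1; here the score matrix for all pairs of positions is available in one pass, which is what allows training to be parallelized.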

Taxonomy

Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Topic Modeling

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Multi-Head Attention · Adam · Dropout · Dense Connections