Character-Based Handwritten Text Transcription with Attention Networks

Jason Poulos; Rafael Valle

arXiv:1712.04046 · cs.CV · August 24, 2021

TL;DR

This paper investigates character-based handwritten text recognition using attention networks, comparing different attention mechanisms and demonstrating the importance of precise alignment for transcription accuracy.

Contribution

The paper analyzes softmax, sigmoid, and linear attention mechanisms in character-based HTR and shows how alignment precision affects transcription performance.

Findings

01

Softmax attention provides more precise character alignment.

02

Sigmoid attention tends to focus on multiple characters at each decoding step, which makes its alignment less precise.

03

Attention weights obtained from a linear function lead to poor performance because the model looks at the entire sequence of characters and lacks a precise alignment.

Abstract

The paper approaches the task of handwritten text recognition (HTR) with attentional encoder-decoder networks trained on sequences of characters, rather than words. We experiment on lines of text from popular handwriting datasets and compare different activation functions for the attention mechanism used for aligning image pixels and target characters. We find that softmax attention focuses heavily on individual characters, while sigmoid attention focuses on multiple characters at each step of the decoding. When the sequence alignment is one-to-one, softmax attention is able to learn a more precise alignment at each step of the decoding, whereas the alignment generated by sigmoid attention is much less precise. When a linear function is used to obtain attention weights, the model predicts a character by looking at the entire sequence of characters and performs poorly because it lacks a…
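
The contrast between the three attention activations described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names and the alignment scores below are hypothetical, chosen only to show how the same scores yield a sharply peaked weighting under softmax, several simultaneously high weights under sigmoid, and unnormalized weights under a linear (identity) function.

```python
import math

def softmax_weights(scores):
    # Normalize scores into a distribution that sums to 1.
    # Subtracting the max keeps exp() numerically stable.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid_weights(scores):
    # Squash each score independently into (0, 1).
    # The weights do not compete, so several positions can be high at once.
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]

def linear_weights(scores):
    # Identity: the raw scores are used directly as weights,
    # with no normalization and no bound on their range.
    return list(scores)

# Hypothetical alignment scores for one decoding step
# over five encoder positions (assumed values, not from the paper).
scores = [-2.0, 0.5, 4.0, 0.0, -1.5]

soft = softmax_weights(scores)
sig = sigmoid_weights(scores)
lin = linear_weights(scores)

print("softmax:", [round(w, 3) for w in soft])
print("sigmoid:", [round(w, 3) for w in sig])
print("linear: ", lin)
```

With these assumed scores, softmax places most of its mass on the single highest-scoring position, while sigmoid assigns a weight of at least 0.5 to three positions at once. This mirrors, in miniature, the abstract's observation that softmax attention focuses on individual characters and sigmoid attention on multiple characters.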

Code & Models

Repositories

jvpoulos/Attention-OCR
TensorFlow · Official

Taxonomy

Methods: Softmax