Character Queries: A Transformer-based Approach to On-Line Handwritten Character Segmentation

Michael Jungo; Beat Wolf; Andrii Maksai; Claudiu Musat; Andreas Fischer

arXiv:2309.03072·cs.CV·September 7, 2023

TL;DR

This paper introduces a Transformer-based method for online handwritten character segmentation that leverages known transcriptions to improve segmentation accuracy, outperforming existing methods on standard datasets.

Contribution

It presents a novel Transformer architecture with learned character queries for precise segmentation, decoupling segmentation from recognition and addressing the assignment problem.

Findings

01 Achieves state-of-the-art segmentation results on IAM-OnDB and HANDS-VNOnDB datasets.

02 Demonstrates the effectiveness of Transformer-based cluster assignment in handwriting segmentation.

03 Provides new ground truth data for evaluation of online handwritten character segmentation.

Abstract

On-line handwritten character segmentation is often associated with handwriting recognition and even though recognition models include mechanisms to locate relevant positions during the recognition process, it is typically insufficient to produce a precise segmentation. Decoupling the segmentation from the recognition unlocks the potential to further utilize the result of the recognition. We specifically focus on the scenario where the transcription is known beforehand, in which case the character segmentation becomes an assignment problem between sampling points of the stylus trajectory and characters in the text. Inspired by the $k$-means clustering algorithm, we view it from the perspective of cluster assignment and present a Transformer-based architecture where each cluster is formed based on a learned character query in the Transformer decoder block. In order to assess the quality…
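
The cluster-assignment view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: in the paper the character queries are learned and the point features come from a Transformer, whereas here both are hand-picked toy vectors, and the function name `assign_points_to_characters` is hypothetical. The sketch only shows the final step, where each sampling point is assigned to the character whose query it matches best.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def assign_points_to_characters(
    point_features: List[Sequence[float]],
    character_queries: List[Sequence[float]],
) -> List[int]:
    """For each sampling point, return the index of the best-matching query.

    Assumption: similarity is a plain dot product. This stands in for
    whatever scoring the real model uses and is chosen for illustration only.
    """
    if not character_queries:
        raise ValueError("at least one character query is required")
    assignments = []
    for feature in point_features:
        scores = [dot(feature, query) for query in character_queries]
        # Index of the highest score; ties resolve to the earliest character.
        assignments.append(max(range(len(scores)), key=scores.__getitem__))
    return assignments


# Toy data (hypothetical): four sampling points, two characters.
points = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.8)]
queries = [(1.0, 0.0), (0.0, 1.0)]
segmentation = assign_points_to_characters(points, queries)
print(segmentation)
```

Each entry of `segmentation` is the character index for the corresponding sampling point, so one cluster is formed per character of the known transcription.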

Code & Models

Repositories

jungomi/character-queries (PyTorch, official)

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Byte Pair Encoding · Label Smoothing · Dropout · Absolute Position Encodings · Layer Normalization · Adam