A Transformer Based Handwriting Recognition System Jointly Using Online and Offline Features
Ayush Lodh, Ritabrata Chakraborty, Shivakumara Palaiahnakote, Umapada Pal

TL;DR
This paper introduces a novel end-to-end transformer-based handwriting recognition system that fuses online stroke data and offline images early in the process, leading to improved accuracy and writer independence.
Contribution
The work presents a new joint online-offline handwriting recognition model using early fusion in a shared latent space with transformers, achieving state-of-the-art results.
Findings
Improves accuracy by up to 1% over the previous best results.
Demonstrates strong writer independence in recognition.
Successfully adapts to gesturification on the ISI-Air dataset.
Abstract
We posit that handwriting recognition benefits from complementary cues carried by the rasterized complex glyph and the pen's trajectory, yet most systems exploit only one modality. We introduce an end-to-end network that performs early fusion of offline images and online stroke data within a shared latent space. A patch encoder converts the grayscale crop into fixed-length visual tokens, while a lightweight transformer embeds the pen-stroke sequence. Learnable latent queries attend jointly to both token streams, yielding context-enhanced stroke embeddings that are pooled and decoded under a cross-entropy loss objective. Because integration occurs before any high-level classification, temporal and visual cues reinforce each other during representation learning, producing stronger writer independence. Comprehensive experiments on IAMOn-DB and VNOn-DB demonstrate that our approach achieves…
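The early-fusion idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the latent width, patch size, number of latent queries, class count, the (x, y, pen) point layout, and every function and parameter name below are assumptions made for illustration. A single linear projection stands in for the lightweight stroke transformer, and random weights stand in for learned ones.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 32          # shared latent width (assumed)
PATCH = 8       # patch side in pixels (assumed)
N_LATENT = 4    # number of learnable latent queries (assumed)
N_CLASSES = 10  # output vocabulary size (assumed)


def patchify(img, patch=PATCH):
    """Split an (H, W) grayscale crop into flattened non-overlapping patches."""
    h, w = img.shape
    rows, cols = h // patch, w // patch
    img = img[:rows * patch, :cols * patch]
    blocks = img.reshape(rows, patch, cols, patch).transpose(0, 2, 1, 3)
    return blocks.reshape(rows * cols, patch * patch)


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, tokens, wq, wk, wv):
    """Single-head attention: every latent query attends over all tokens."""
    q, k, v = queries @ wq, tokens @ wk, tokens @ wv
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return weights @ v, weights


# Randomly initialised stand-ins for learned parameters (hypothetical names).
params = {
    "w_patch": rng.normal(0.0, 0.1, (PATCH * PATCH, D)),
    "w_stroke": rng.normal(0.0, 0.1, (3, D)),
    "latents": rng.normal(0.0, 0.1, (N_LATENT, D)),
    "wq": rng.normal(0.0, 0.1, (D, D)),
    "wk": rng.normal(0.0, 0.1, (D, D)),
    "wv": rng.normal(0.0, 0.1, (D, D)),
    "w_out": rng.normal(0.0, 0.1, (D, N_CLASSES)),
}


def forward(img, strokes, p=params):
    visual = patchify(img) @ p["w_patch"]   # offline stream: (n_patches, D)
    pen = strokes @ p["w_stroke"]           # online stream: (n_points, D)
    # Early fusion: both token streams share one sequence before any classifier.
    joint = np.concatenate([visual, pen], axis=0)
    fused, weights = cross_attention(p["latents"], joint,
                                     p["wq"], p["wk"], p["wv"])
    pooled = fused.mean(axis=0)             # pool the latent outputs
    return pooled @ p["w_out"], weights     # class logits, attention map


def cross_entropy(logits, target):
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[target]


img = rng.random((32, 64))       # toy grayscale crop
strokes = rng.random((50, 3))    # toy pen points, assumed (x, y, pen) layout
logits, weights = forward(img, strokes)
loss = cross_entropy(logits, target=3)
```

Because the latent queries see image patches and pen points in one attention step, each row of `weights` spans both modalities, which is the sense in which fusion happens before classification in this sketch.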
Taxonomy
Topics: Handwritten Text Recognition Techniques · Hand Gesture Recognition Systems · Vehicle License Plate Recognition
