A Transformer-based Approach for Arabic Offline Handwritten Text Recognition

Saleh Momeni; Bagher BabaAli

arXiv:2307.15045·cs.CV·July 28, 2023

TL;DR

This paper introduces Transformer-based architectures for offline Arabic handwritten text recognition, improving accuracy and parallelization over traditional RNN-based methods by leveraging attention mechanisms and pre-trained models.

Contribution

The paper proposes novel Transformer architectures for Arabic handwriting recognition, replacing RNNs, and demonstrates superior performance on benchmark datasets.

Findings

01. Outperforms state-of-the-art methods in accuracy

02. Offers faster processing due to parallelizable attention mechanisms

03. Effectively models language dependencies without external language models
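
The parallelization point in the findings above can be illustrated with a minimal NumPy sketch. It is not the paper's architecture: the function names, the toy dimensions, and the random weights are all assumptions made for illustration. It contrasts a single-head self-attention layer, which processes every position of a feature sequence in a few matrix products, with a plain recurrent layer, which must step through the sequence one position at a time.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention.
    # All T positions are handled by the same matrix products,
    # so there is no dependency between time steps.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # shape (T, T)
    return softmax(scores, axis=-1) @ V       # shape (T, d)

def simple_rnn(X, Wx, Wh):
    # Plain tanh recurrent layer. Step t needs the hidden state
    # from step t-1, so the loop cannot be parallelized over time.
    h = np.zeros(Wh.shape[0])
    outputs = []
    for x_t in X:
        h = np.tanh(x_t @ Wx + h @ Wh)
        outputs.append(h)
    return np.stack(outputs)

# Hypothetical toy input: T = 6 feature vectors of size d = 4,
# standing in for column features extracted from a text-line image.
rng = np.random.default_rng(0)
T, d = 6, 4
X = rng.normal(size=(T, d))
Wq, Wk, Wv, Wx, Wh = (rng.normal(size=(d, d)) for _ in range(5))

attn_out = self_attention(X, Wq, Wk, Wv)
rnn_out = simple_rnn(X, Wx, Wh)
```

Both layers return one output vector per input position. The difference is that the attention output comes from operations over the whole sequence at once, while the recurrent output requires T dependent steps.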

Abstract

Handwriting recognition is a challenging and critical problem in the fields of pattern recognition and machine learning, with applications spanning a wide range of domains. In this paper, we focus on the specific issue of recognizing offline Arabic handwritten text. Existing approaches typically utilize a combination of convolutional neural networks for image feature extraction and recurrent neural networks for temporal modeling, with connectionist temporal classification used for text generation. However, these methods suffer from a lack of parallelization due to the sequential nature of recurrent neural networks. Furthermore, these models cannot account for linguistic rules, necessitating the use of an external language model in the post-processing stage to boost accuracy. To overcome these issues, we introduce two alternative architectures, namely the Transformer Transducer and the…

Taxonomy

Topics: Handwritten Text Recognition Techniques · Natural Language Processing Techniques · Topic Modeling

Methods: Multi-Head Attention · Attention Is All You Need · Byte Pair Encoding · Linear Layer · Softmax · Layer Normalization · Dense Connections · Dropout · Focus · Position-Wise Feed-Forward Layer