Multi-Head Attention based interaction-aware architecture for Bangla Handwritten Character Recognition: Introducing a Primary Dataset

Mirza Raquib; Asif Pervez Polok; Kedar Nath Biswas; Farida Siddiqi Prity; Saydul Akbar Murad; Nick Rahimi

arXiv:2604.09717·cs.CV·April 14, 2026

TL;DR

This paper introduces a new balanced dataset for Bangla handwritten characters and proposes a hybrid deep learning model with multi-head attention for improved recognition accuracy.

Contribution

It presents a novel interaction-aware architecture combining EfficientNetB3, Vision Transformer, and Conformer modules with a multi-head cross-attention mechanism, along with a new dataset.
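To make the multi-head cross-attention idea concrete, here is a minimal NumPy sketch in which tokens from one branch attend over tokens from another. It is not the authors' implementation: the token counts, the model width of 64, the four heads, the random weights, and the names `multi_head_cross_attention`, `init_params`, `cnn_tokens`, and `vit_tokens` are all assumptions chosen for illustration, and the summary above does not say which branch supplies queries and which supplies keys and values.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(d_model, rng):
    # Hypothetical random projection weights; a real model would learn these.
    scale = 1.0 / np.sqrt(d_model)
    names = ("W_q", "W_k", "W_v", "W_o")
    return {n: rng.normal(0.0, scale, size=(d_model, d_model)) for n in names}


def multi_head_cross_attention(query_tokens, context_tokens, params, num_heads):
    """query_tokens: (n_q, d_model); context_tokens: (n_c, d_model)."""
    n_q, d_model = query_tokens.shape
    n_c = context_tokens.shape[0]
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    # Queries come from one branch, keys and values from the other.
    q = query_tokens @ params["W_q"]
    k = context_tokens @ params["W_k"]
    v = context_tokens @ params["W_v"]

    # Split the feature axis into heads: (num_heads, tokens, d_head).
    q = q.reshape(n_q, num_heads, d_head).transpose(1, 0, 2)
    k = k.reshape(n_c, num_heads, d_head).transpose(1, 0, 2)
    v = v.reshape(n_c, num_heads, d_head).transpose(1, 0, 2)

    # Scaled dot-product attention per head: (num_heads, n_q, n_c).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)

    # Weighted sum of values, then merge heads back to (n_q, d_model).
    heads = weights @ v
    merged = heads.transpose(1, 0, 2).reshape(n_q, d_model)
    return merged @ params["W_o"], weights


rng = np.random.default_rng(0)
d_model, num_heads = 64, 4                    # assumed sizes
cnn_tokens = rng.normal(size=(49, d_model))   # hypothetical flattened 7x7 CNN feature map
vit_tokens = rng.normal(size=(16, d_model))   # hypothetical transformer patch tokens
params = init_params(d_model, rng)

fused, attn = multi_head_cross_attention(cnn_tokens, vit_tokens, params, num_heads)
```

Each row of `attn` is a probability distribution over the context tokens, so inspecting it per head is one simple way to see which regions of the other branch a given query token draws on.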

Findings

1. Achieved 98.84% accuracy on the new dataset
2. Achieved 96.49% accuracy on external benchmark
3. Demonstrated strong generalization and interpretability

Abstract

Character recognition is the foundation of an optical character recognition (OCR) system. Higher-order tasks such as word recognition, sentence transcription, document digitization, and language processing all depend on accurate character recognition. Nonetheless, recognizing handwritten Bangla characters is difficult because writing styles vary, stroke patterns are inconsistent, and many characters closely resemble one another. Available datasets usually have limited intra-class variation and an imbalanced class distribution. To overcome these problems, we constructed a new balanced dataset of Bangla handwritten characters. It consists of 78 classes, each with approximately 650 samples, and contains basic characters, composite (Juktobarno) characters, and numerals. The samples were a diverse group comprising a large age…
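The balance claim in the abstract can be checked mechanically once labels are loaded. The sketch below is a hedged illustration, not part of the released dataset tooling: `balance_report` is a hypothetical helper, and the label list assumes exactly 650 samples in each of the 78 classes, whereas the abstract says only "approximately 650".

```python
from collections import Counter


def balance_report(labels):
    """Summarize how evenly samples are spread across classes."""
    counts = Counter(labels)
    smallest = min(counts.values())
    largest = max(counts.values())
    return {
        "num_classes": len(counts),
        "total": sum(counts.values()),
        "min_per_class": smallest,
        "max_per_class": largest,
        # 1.0 means perfectly balanced; larger values mean more skew.
        "imbalance_ratio": largest / smallest,
    }


# Hypothetical labels: 78 classes with exactly 650 samples each (an assumption).
labels = [class_id for class_id in range(78) for _ in range(650)]
report = balance_report(labels)
print(report)
```

Under that assumption the dataset would hold 78 × 650 = 50,700 samples; the real total may differ slightly because the per-class count is approximate.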

Code & Models

Repositories

https://huggingface.co/MIRZARAQUIB/Bangla_Handwritten_Character_Recognition
