Maximizing Generalization: The Effect of Different Augmentation Techniques on Lightweight Vision Transformer for Bengali Character Classification

Rafi Hassan Chowdhury; Naimul Haque; Kaniz Fatiha

arXiv:2603.02591·cs.CV·March 4, 2026

Open Access

TL;DR

This paper investigates how image augmentation techniques affect the performance of a lightweight vision transformer on Bengali handwritten character recognition. It shows that specific combinations, notably Random Affine with Color Jitter, improve accuracy when training data is scarce.

Contribution

The paper provides an in-depth analysis of augmentation techniques for Bengali character classification with lightweight models, and identifies the most effective combinations for limited-data settings.

Findings

01

The combination of Random Affine and Color Jitter achieved over 97.5% accuracy.

02

Augmentation techniques significantly improve model performance in resource-limited language datasets.

03

The lightweight EfficientViT model generalizes better when trained with specific augmentation strategies.
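
To make the Random Affine and Color Jitter combination named in the findings concrete, here is a minimal pure-Python sketch of the two transforms applied to a grayscale image stored as a list of rows with values in [0, 1]. It is an illustration only, not the authors' pipeline: the function names (`random_affine`, `color_jitter`, `augment`), the parameter ranges (up to 10 degrees of rotation, 10% shift, 20% brightness and contrast change), and the nearest-neighbour resampling are all assumptions chosen for readability. On a grayscale image, color jitter reduces to brightness and contrast changes, so saturation and hue are omitted.

```python
import math
import random


def random_affine(img, max_deg=10.0, max_shift=0.1, rng=random, fill=0.0):
    """Rotate about the image centre and translate by random amounts.

    Uses inverse mapping with nearest-neighbour sampling; pixels that map
    outside the source image are set to `fill`.
    """
    h, w = len(img), len(img[0])
    theta = math.radians(rng.uniform(-max_deg, max_deg))
    dx = rng.uniform(-max_shift, max_shift) * w
    dy = rng.uniform(-max_shift, max_shift) * h
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Undo the translation, then apply the inverse rotation.
            xs, ys = x - dx - cx, y - dy - cy
            src_x = round(cos_t * xs + sin_t * ys + cx)
            src_y = round(-sin_t * xs + cos_t * ys + cy)
            if 0 <= src_x < w and 0 <= src_y < h:
                out[y][x] = img[src_y][src_x]
    return out


def color_jitter(img, brightness=0.2, contrast=0.2, rng=random):
    """Randomly scale brightness, then stretch or shrink contrast about the mean."""
    b = rng.uniform(1.0 - brightness, 1.0 + brightness)
    c = rng.uniform(1.0 - contrast, 1.0 + contrast)
    h, w = len(img), len(img[0])
    bright = [[min(1.0, max(0.0, p * b)) for p in row] for row in img]
    mean = sum(map(sum, bright)) / (h * w)
    return [[min(1.0, max(0.0, mean + c * (p - mean))) for p in row]
            for row in bright]


def augment(img, rng=random):
    """Apply the affine transform first, then the intensity jitter."""
    return color_jitter(random_affine(img, rng=rng), rng=rng)


# Hypothetical 28x28 "character": a single vertical stroke on a dark background.
stroke = [[1.0 if 12 <= x <= 15 else 0.0 for x in range(28)] for _ in range(28)]
augmented = augment(stroke, rng=random.Random(0))  # seeded for repeatability
```

Each call to `augment` yields a slightly different view of the same character, which is how augmentation enlarges the effective training set. In practice an image library would normally be used instead; torchvision, for example, ships `RandomAffine` and `ColorJitter` transforms that can be chained with `Compose`.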

Abstract

Deep learning models have proven highly effective in computer vision, with deep convolutional neural networks achieving impressive results across a wide range of tasks. However, these models rely heavily on large datasets to avoid overfitting. When a model learns features with either low or high variance, the result can be underfitting or overfitting on the training data. Unfortunately, large-scale datasets are unavailable in many domains, particularly for resource-limited languages such as Bengali. In this study, a series of experiments was conducted on image data augmentation as an approach to the limited-data problem for Bengali handwritten characters. The study also provides an in-depth analysis of the performance of different augmentation techniques. Data augmentation refers to a set of techniques applied to data to increase its size and…

Taxonomy

Topics: Handwritten Text Recognition Techniques · Advanced Neural Network Applications · Medical Image Segmentation Techniques