Maximizing Generalization: The Effect of Different Augmentation Techniques on Lightweight Vision Transformer for Bengali Character Classification
Rafi Hassan Chowdhury, Naimul Haque, Kaniz Fatiha

TL;DR
This paper investigates how various image augmentation techniques affect the performance of a lightweight vision transformer in recognizing Bengali handwritten characters, demonstrating that specific combinations significantly improve accuracy in data-scarce scenarios.
Contribution
The paper provides an in-depth analysis of augmentation techniques for Bengali character classification with lightweight models, and identifies the most effective combinations for limited-data settings.
Findings
The combination of Random Affine and Color Jitter achieved over 97.5% accuracy.
Augmentation techniques significantly improve model performance on datasets for resource-limited languages.
The lightweight EfficientViT model generalizes better when trained with specific augmentation strategies.
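To make the best-performing combination concrete, here is a minimal sketch of a random affine transform followed by a simplified color jitter on a grayscale image stored as nested lists of floats in [0, 1]. The source does not give the implementation or the parameter ranges, so the function names and the defaults (max_deg, max_shift, brightness, contrast) are illustrative assumptions, not the authors' settings.

```python
import math
import random


def random_affine(img, max_deg=10.0, max_shift=0.1, rng=random):
    """Rotate about the image centre and translate by random amounts.

    Uses inverse mapping with nearest-neighbour sampling; pixels that map
    outside the source image are filled with 0.0. Ranges are assumptions.
    """
    h, w = len(img), len(img[0])
    theta = math.radians(rng.uniform(-max_deg, max_deg))
    tx = rng.uniform(-max_shift, max_shift) * w
    ty = rng.uniform(-max_shift, max_shift) * h
    cx, cy = (w - 1) / 2, (h - 1) / 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse map: undo the translation, then undo the rotation.
            dx, dy = x - cx - tx, y - cy - ty
            sx = round(cos_t * dx + sin_t * dy + cx)
            sy = round(-sin_t * dx + cos_t * dy + cy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = img[sy][sx]
    return out


def color_jitter(img, brightness=0.2, contrast=0.2, rng=random):
    """Apply a random brightness scale, then a random contrast scale.

    Simplified: fixed order, grayscale only, no saturation or hue.
    """
    b = 1.0 + rng.uniform(-brightness, brightness)
    c = 1.0 + rng.uniform(-contrast, contrast)
    bright = [[min(1.0, max(0.0, p * b)) for p in row] for row in img]
    mean = sum(map(sum, bright)) / (len(img) * len(img[0]))
    return [[min(1.0, max(0.0, (p - mean) * c + mean)) for p in row]
            for row in bright]


def augment(img, rng=random):
    """Combine the two transforms: affine first, then color jitter."""
    return color_jitter(random_affine(img, rng=rng), rng=rng)


# Toy 8x8 "character": a bright square on a dark background.
rng = random.Random(0)
image = [[1.0 if 2 <= x <= 5 and 2 <= y <= 5 else 0.0 for x in range(8)]
         for y in range(8)]
augmented = augment(image, rng=rng)
```

Each call to augment draws fresh parameters, so a training loop sees a different variant of the same character every epoch. In practice a library implementation would replace these hand-written loops.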
Abstract
Deep learning models have proven to be highly effective in computer vision, with deep convolutional neural networks achieving impressive results across various computer vision tasks. However, these models rely heavily on large datasets to avoid overfitting. When a model learns features with either low or high variance, it can lead to underfitting or overfitting on the training data. Unfortunately, large-scale datasets may not be available in many domains, particularly for resource-limited languages such as Bengali. In this experiment, a series of tests were conducted in the field of image data augmentation as an approach to addressing the limited data problem for Bengali handwritten characters. The study also provides an in-depth analysis of the performance of different augmentation techniques. Data augmentation refers to a set of techniques applied to data to increase its size and…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Advanced Neural Network Applications · Medical Image Segmentation Techniques
