TL;DR
This paper presents an end-to-end deep learning-based OCR system specifically designed for recognizing handwritten Bengali words, utilizing CNN and RNN architectures to achieve low error rates.
Contribution
It introduces the first end-to-end OCR architecture for Bengali handwritten words, experimenting with multiple CNN and RNN models for optimal performance.
Findings
DenseNet121 with GRU achieves a character error rate of 0.091.
The system achieves a word error rate of 0.273.
The approach outperforms previous methods on the BanglaWriting dataset.
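To make the two metrics above concrete, here is a minimal sketch of how a character error rate and a word error rate could be computed for single-word images. The definitions are assumptions, since this summary does not state the paper's exact formulas: the character error rate is taken as total Levenshtein edit distance divided by total reference characters, and the word error rate as the fraction of word images whose prediction differs from the reference. The function names are hypothetical, and characters are counted as Unicode code points, which may differ from a grapheme-based count for Bengali.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (here, strings)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters reaches the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(refs, hyps):
    """Assumed definition: total edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return total_edits / total_chars


def word_error_rate(refs, hyps):
    """Assumed definition: share of word images predicted incorrectly."""
    wrong = sum(1 for r, h in zip(refs, hyps) if r != h)
    return wrong / len(refs)


if __name__ == "__main__":
    # Illustrative strings only, not taken from the dataset.
    references = ["বাংলা", "ভাষা"]
    predictions = ["বাংলা", "ভাসা"]
    print(character_error_rate(references, predictions))
    print(word_error_rate(references, predictions))
```

Under these definitions a single wrong character makes the whole word count as an error, which is one reason the word error rate reported for a system is typically higher than its character error rate.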
Abstract
Optical character recognition (OCR) is a process of converting analogue documents into digital using document images. Currently, many commercial and non-commercial OCR systems exist for both handwritten and printed copies for different languages. Despite this, very few works are available on recognising Bengali words. Among them, most of the works focused on OCR of printed Bengali characters. This paper introduces an end-to-end OCR system for the Bengali language. The proposed architecture implements an end-to-end strategy that recognises handwritten Bengali words from handwritten word images. We experiment with popular convolutional neural network (CNN) architectures, including DenseNet, Xception, NASNet, and MobileNet to build the OCR architecture. Further, we experiment with two different recurrent neural network (RNN) methods, LSTM and GRU. We evaluate the proposed architecture…
Taxonomy
Methods: Depthwise Convolution · Pointwise Convolution · Sigmoid Activation · Depthwise Separable Convolution · Max Pooling · Concatenated Skip Connection · Convolution · 1x1 Convolution · Average Pooling · Dense Connections
