Devnagari document segmentation using histogram approach

Vikas J Dongre; Vijay H Mankar

arXiv:1109.1247 · cs.CV · September 7, 2011 · 37 cites

TL;DR

This paper proposes a simple histogram-based method for segmenting Devnagari documents, addressing challenges posed by the script's complex structure to improve character recognition accuracy.

Contribution

It proposes a simple histogram-based approach to segmenting Devnagari documents into lines and words, and discusses the segmentation challenges posed by the script's vowels, consonants, and modifiers.

Findings

01. Effective segmentation of Devnagari documents achieved

02. Addresses challenges in script complexity and modifiers

03. Improves accuracy of subsequent character recognition

Abstract

Document segmentation is one of the critical phases in machine recognition of any language. Correct segmentation of individual symbols determines the accuracy of the character recognition technique. Segmentation decomposes an image of a sequence of characters into sub-images of individual symbols by segmenting lines and words. Devnagari is the most popular script in India. It is used for writing the Hindi, Marathi, Sanskrit, and Nepali languages. Moreover, Hindi is the third most popular language in the world. Devnagari documents consist of vowels, consonants, and various modifiers. Hence, proper segmentation of a Devnagari word is challenging. A simple histogram-based approach to segmenting Devnagari documents is proposed in this paper. Various challenges in the segmentation of Devnagari script are also discussed.
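
The abstract does not spell out the algorithm, so the following is only a minimal sketch of a generic histogram (projection profile) segmentation, not the authors' implementation. It assumes a binarized page stored as a NumPy array in which ink pixels are 1 and background pixels are 0. The function names `runs_of_ink`, `segment_lines`, and `segment_words`, as well as the `min_gap` parameter, are hypothetical names chosen for illustration.

```python
import numpy as np


def runs_of_ink(profile, min_gap=1):
    """Return (start, end) pairs, end exclusive, where the histogram is non-zero.

    Runs separated by fewer than `min_gap` empty positions are merged.
    `min_gap` is an illustrative assumption, not a value from the paper.
    """
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            runs.append((start, i))        # the run ends at the first empty position
            start = None
    if start is not None:
        runs.append((start, len(profile)))

    merged = []
    for s, e in runs:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)  # gap too small: join with previous run
        else:
            merged.append((s, e))
    return merged


def segment_lines(binary_page):
    """Split a page into text lines using the horizontal histogram (ink per row)."""
    return runs_of_ink(binary_page.sum(axis=1))


def segment_words(binary_line, min_gap=2):
    """Split one text line into words using the vertical histogram (ink per column)."""
    return runs_of_ink(binary_line.sum(axis=0), min_gap=min_gap)


# Tiny synthetic page: two "words" on the first line, one on the second.
page = np.zeros((7, 12), dtype=np.uint8)
page[1:3, 1:4] = 1
page[1:3, 6:9] = 1
page[5:6, 2:5] = 1

lines = segment_lines(page)
first_line_words = segment_words(page[lines[0][0]:lines[0][1]])
```

Here `lines` holds the row ranges of the two text lines, and `first_line_words` holds the column ranges of the two words on the first line. Splitting words into individual characters and modifiers is not covered by this sketch.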

Taxonomy

Topics: Handwritten Text Recognition Techniques · Vehicle License Plate Recognition · Image Retrieval and Classification Techniques